Abstract
Edgelike and linelike features result from spatial phase congruence, the local phase agreement between harmonic components of a spatial waveform. Psychophysical observations and models of early visual processing suggest that human visual feature detectors are specialized for edgelike and linelike phase congruence. To test whether primary visual cortex (V1) neurons account for such specificity, we made tetrode recordings in anesthetized macaque monkeys. Stimuli were drifting equal-energy compound gratings composed of four sinusoidal components. Eight congruence phases (one-dimensional features) were tested, including linelike and edgelike waveforms. Many of the 137 single V1 neurons (recorded at 45 sites) could reliably signal phase congruence by any of several response measures. Across neurons, the preferred spatial feature had only a modest bias for linelike waveforms. Information-theoretic analysis showed that congruence phase was temporally encoded in the frequency band present in the stimuli. The most sensitive neurons had feature discrimination thresholds that approached psychophysical levels, but typical neurons were substantially less sensitive. In single V1 neurons, feature discrimination exhibited various dependences on the congruence phase of the reference waveform. Simple cells were overrepresented among the most sensitive neurons and on average carried twice as much feature information as complex cells. However, the distribution of the indices of optimal tuning and discrimination of relative phase was indistinguishable in simple and complex cells. Our results suggest that phase-sensitive pooling of responses is required to account for human psychophysical performance, although variation in feature selectivity among nearby neurons is considerable.
 spatial feature detection
 feature discrimination
 phase-selective nonlinearity
 congruence phase
 edge
 line
 macaque
 primary visual cortex
 transinformation
 simple and complex cells
Psychophysical studies of spatial vision have demonstrated the importance of spatial phase information in shape perception (Burton and Moorhead, 1981; Oppenheim and Lim, 1981), texture discrimination (Klein and Tyler, 1986; Rentschler et al., 1988), and contour integration (Field et al., 1993; Kovacs and Julesz, 1993; Dakin and Hess, 1999). Edgelike and linelike features are examples of salient spatial cues defined by phase. Detection thresholds for compound gratings (Tolhurst, 1972; Shapley and Tolhurst, 1973; Tolhurst and Dealy, 1975), and the discrimination sensitivity for the relative spatial phase of harmonic components of compound gratings (Burr, 1980; Badcock, 1984a, b; Burr et al., 1989) as well as the phase dependence in monocular rivalry (Atkinson and Campbell, 1974) and afterimages (Georgeson and Turner, 1985), are all consistent with the existence of two classes of feature detectors, one tuned to edgelike and the other to linelike waveforms. Human discrimination of relative phase requires contrasts markedly above detection threshold (Nachmias and Weber, 1975), indicating that the mechanism underlying discrimination is nonlinear.
The prevailing view of early vision posits localized and spectrally bandlimited image analysis at multiple spatial scales. The privileged role of lines and edges as features in human vision is posited to derive from phase congruence (Morrone and Burr, 1988). This is illustrated in Figure 1. Phase congruence denotes a local phenomenon whereby harmonic components across spatial scales share a common phase and, consequently, reinforce that phase by summation. Edges and lines are examples of salient phase congruence across spatial scales. Sensitivity to phase congruence requires the existence of local mechanisms that compare relative phase information across multiple scales.
Theoretical work also motivates these experiments. The nonlinear feature detector model developed by Burr and Morrone (1992) derives an edge versus line feature dichotomy from the orthogonal odd versus even symmetry of the spatial function of these features' cross-section. The first stage of their model consists of even/odd symmetry-sensitive linear spatial filters, idealized cortical simple cells. The second stage, intended to represent complex cells, implements a local energy operator: squared filter outputs are summed within a single orientation band in a phase-specific manner. At the final stage, features are identified by a winner-take-all localization of maxima in the map of feature energy. The model of Burr and Morrone (1992) makes successful qualitative predictions of illusions, quantitative predictions of thresholds, and testable predictions for the roles of simple and complex cells in feature detection and discrimination.
Our paper expands on earlier studies that assayed with spatial compound gratings the feature (relative phase) selectivity of single neurons in the primary visual cortex (V1) of cat (De Valois and Tootell, 1983; Levitt et al., 1990) and monkey (Pollen et al., 1988). We found that nonlinearities contributed to feature coding in the entire frequency band of the stimulus. Most response harmonics, but not the DC, were tuned to features. Preferred features were rather evenly distributed in V1 (edges or lines were not overtly overrepresented) and also varied within local clusters. Feature discrimination threshold in the most sensitive V1 neurons approached human psychophysical thresholds. These statements held for both simple and complex cells. The pattern of feature tuning and discrimination observed in V1 neurons puts new constraints on our models of cortical circuits.
Parts of this paper have been published previously at the 1998 and 1999 Annual Meetings of The Society for Neuroscience (Mechler et al., 1998a, 1999).
MATERIALS AND METHODS
Physiological preparation. Standard acute preparation techniques were used for electrophysiological recordings from single units in the V1 of the primate (cynomolgus monkeys, Macaca fascicularis). All procedures were in accordance with institutional and National Institutes of Health guidelines for the care and experimental use of animals. Some details of the techniques have been given earlier (Mechler et al., 1998b).
Experiments were performed on 14 adult animals, weighing 3–4.5 kg. Before surgery, animals were given atropine (0.1 mg/kg, i.m.) and then anesthetized with ketamine (10 mg/kg, i.m.; Ketaset, Fort Dodge, IA). Anesthesia was maintained with sufentanil citrate (3–6 μg · kg^{−1} · hr^{−1}, i.v.; Sufenta, Janssen, Titusville, NJ), and muscle paralysis was induced (after all surgical procedures) and maintained with pancuronium bromide (0.1 mg · kg^{−1} · hr^{−1}, i.v.). Dexamethasone (1 mg/kg, i.m.) and gentamicin (5 mg/kg, i.m.) were given to help prevent the development of cerebral edema and infection, respectively. The animal was ventilated through an endotracheal tube. Heart rate, EKG, arterial blood pressure, and end-tidal CO_{2} were continuously monitored with a Model 78354A Hewlett-Packard Patient Monitor and kept in the normal physiological range. Core body temperature was maintained between 37 and 38°C using a thermostatically controlled heating pad. The EEG was obtained from frontal leads and monitored on an oscilloscope.
A limited unilateral craniotomy to expose the primary visual cortex was made overlying and posterior to the lunate sulcus (the Horsley-Clarke stereotaxic coordinates were typically 14–16 mm posterior and 14–16 mm lateral). A 1–2 mm durotomy was made for the recording electrode, which was stabilized after insertion by agarose gel.
Extracellular recording. Spike responses of single units were recorded extracellularly. We used either traditional glass-coated tungsten microelectrodes (single tip; typical resistance 2 MΩ) (Merrill and Ainsworth, 1972; Ainsworth et al., 1977), or quartz-coated platinum-tungsten fiber tetrodes (Thomas Recording, Giessen, Germany). Tetrodes had a conical tip, with four contacts of ∼1 MΩ each, ∼25 μm apart: one at the apex and three arranged in radial symmetry on the conical surface. A stepper motor advanced either type of electrode in 1 μm steps.
The signals from the electrode or tetrode channels were passed through a unity gain (for the tetrode, multichannel) differential headstage amplifier (NB Labs, Denison, TX, or NeuraLynx, Tucson, AZ), and then further amplified and filtered (0.3–6 kHz passband, NeuraLynx eight-channel differential amplifier). Analog candidate spike waveforms, as detected by a threshold criterion, were digitized at 25 kHz within a short (∼1.2 msec) temporal window containing the peak amplitude, and then recorded on computer disk (Discovery software, DataWave Technologies, Longmont, CO). Multiple single units were isolated by cluster analysis of spike waveforms initially performed online (Autocut, DataWave Technologies), then offline [custom software (Reich, 2001)]. Isolation criteria included stability of principal components of spike waveforms and a 1.2 msec minimum interspike interval consistent with a physiologic refractory period. Spike times for further data analysis were identified offline to 0.1 msec, the accuracy to which the clocks of the recording computer and the stimulus generator were synchronized.
Histology and laminar assignment of recording sites. Experiments lasted for 4–5 d, at the end of which the animal was killed by infusion of a lethal dose of methohexital (Brevital; Eli Lilly & Co., Indianapolis, IN). After transcardiac perfusion with 4% paraformaldehyde in PBS, a block of the occipital cortex containing the penetration was saved for histological reconstruction of the electrode track. The block was cut in 40-μm-thick parasagittal sections, approximately parallel with the plane of the electrode penetration. Lesioned landmarks and fluorescent tracing aided track reconstruction. Electrolytic lesions (5 μA × 5 sec, electrode positive) were made, on withdrawal after recording was completed, at two or more points along all the tracks made with an Ainsworth single electrode, and on some tracks made with tetrodes. Fluorescent full-track tracing was made with the lipophilic dye DiI (D282; Molecular Probes, Eugene, OR). The dye, applied in a thin coat on the tetrode tip before penetration, left a ∼40- to 200-μm-wide trace from entry to the point of deepest penetration. These traces were easily identified in fluorescent micrographs prepared from sections before Nissl staining. In the same sections, the laminar boundaries were identified from the overlaid light micrographs of the Nissl density taken after Nissl staining. Lesions were also best identified on the Nissl-stained sections. Laminar positions of the recording sites were estimated relative to the pattern of Nissl density along the reconstructed electrode track after correction for tissue shrinkage. With this method we successfully identified the laminar position of two-thirds of the recording sites. Sites near a laminar boundary within the precision of reconstruction were classified as located in either lamina across the boundary.
However, even with good histology, occasionally landmark positions could not be found or remained ambiguous, and laminar positions were either not assigned to recording sites or could only be classified in one of three gross divisions (granular, supragranular, or infragranular layers).
Optics. The eyes were treated with anti-inflammatory (Ocufen) and antibacterial (neomycin) ophthalmic solutions. Pupils were dilated with topical application of 1% atropine sulfate (Atrosulf-1; Optics Laboratories Co., Fairton, NJ) and covered with gas-permeable contact lenses (Metro Optics Inc., Houston, TX) under eyelids retracted with 6-0 chromic gut sutures. Artificial pupils (2 mm) and corrective lenses were used to focus the stimulus on the retina. Optical correction was estimated by retinoscopy and then refined by optimizing responses of isolated single units to high spatial frequency visual stimuli.
Visual stimulation. Foveae were mapped on a tangent board by backprojection with an ophthalmoscope. The receptive fields of isolated neurons were mapped on the same board with a laser. The standard simple/complex classification, based on the modulation ratio, was used: if the fundamental of the response to a drifting grating of near optimal spatial parameters was larger than the DC component (after subtraction of the maintained rate of firing), then the cell was cast as simple, and complex otherwise (Movshon et al., 1978b; De Valois et al., 1982; Skottun et al., 1991).
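The classification rule can be sketched in a few lines. This is an illustration only, not the analysis code used in the study: the function names, the convention that the fundamental is expressed as a peak amplitude, and the representation of the response as one stimulus cycle of a peristimulus time histogram are all assumptions.

```python
import cmath
import math


def fourier_component(psth, harmonic):
    """Complex Fourier coefficient of one stimulus cycle of a PSTH.

    Scaled (an assumed convention) so that abs() gives the mean rate for
    harmonic 0 and the peak amplitude for harmonics >= 1.
    """
    n = len(psth)
    total = sum(r * cmath.exp(-2j * math.pi * harmonic * i / n)
                for i, r in enumerate(psth))
    scale = 1.0 if harmonic == 0 else 2.0
    return scale * total / n


def classify_cell(psth, maintained_rate=0.0):
    """Return (label, modulation ratio) from a cycle-averaged response.

    The DC term is taken after subtracting the maintained firing rate,
    as described in the text; a ratio above 1 is labeled 'simple'.
    """
    dc = fourier_component(psth, 0).real - maintained_rate
    f1 = abs(fourier_component(psth, 1))
    ratio = f1 / dc if dc > 0 else math.inf
    return ("simple" if ratio > 1.0 else "complex"), ratio
```

Under these assumptions, a half-wave rectified sinusoid is labeled simple with a ratio close to π/2, and a weakly modulated elevated response such as 10 + 2 sin(θ) is labeled complex with a ratio of 0.2.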
Visual stimuli were generated by a special purpose stimulus generator (Milkman et al., 1978, 1980) under the control of a PDP-11/93 computer and displayed on a Tektronix 608 monochrome oscilloscope (green phosphor; 150 cd/m^{2} mean luminance; 270.32 Hz frame refresh). The luminance of the display was linearized with lookup tables in the range of 0–300 cd/m^{2}. At the 114 cm viewing distance of the animal, the stimuli appeared in a 4° circular aperture on a dark background.
After isolation of single units, their receptive fields were characterized in a standard way using drifting sine gratings: tuning was measured first for orientation, then for spatial frequency, and finally for temporal frequency, each parameter optimized for subsequent tuning measurements. The contrast response function was measured using the optimal sine grating. When multiple single units were simultaneously isolated with tetrodes, receptive-field characterization was always done for the most responsive unit, and often for a second unit. For many neurons, the receptive field was also characterized with pseudorandom black-and-white checkerboards modulated by long (2^{12} − 1 frames) binary m-sequences at 67.58 Hz. Our implementation of m-sequence stimuli and associated analysis procedures have been described in detail previously (Victor, 1992; Reid et al., 1997; Reich et al., 2000).
Compound gratings. In our experiments, 1D gratings were drifting at or near the optimal orientation and direction for the V1 neurons. With the spatial origin centered on the display, the spatiotemporal light variation ΔI(x,t) around a spatiotemporal mean intensity I_{0} in a single drifting sine grating is described, in cosine formulation for convenience, by:
Each of our compound-grating stimuli is constructed from four of these single-grating harmonic components. We use a superposition of odd harmonics. That is, the mth component grating is chosen to have a frequency equal to 2m − 1 times the fundamental. Consequently, the light variation around the mean intensity in the mth component, S_{m}(x,t), is given by:
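A form consistent with the surrounding definitions is sketched below. The symbols c (contrast of the fundamental), k (fundamental spatial frequency), and f (fundamental temporal frequency), the sign convention for the drift direction, and the 1/(2m − 1) amplitude scaling (inferred from the statement in Results that component contrasts were scaled as in an edge) are assumptions made for illustration and may differ from the notation of the original equations.

```latex
% Single drifting sine grating, cosine formulation (assumed notation)
\Delta I(x,t) = I_0 \, c \, \cos\!\bigl(2\pi (k x - f t) + \varphi\bigr)

% m-th odd-harmonic component of the compound grating, m = 1,\dots,4
S_m(x,t) = I_0 \, \frac{c}{2m-1} \,
           \cos\!\bigl(2\pi (2m-1)(k x - f t) + \varphi\bigr)

% Compound grating: all four components share the congruence phase \varphi
\Delta I_{\mathrm{compound}}(x,t) = \sum_{m=1}^{4} S_m(x,t)
```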
Note that the phase parameter φ specifies the shape of each compound grating. As the phase parameter increases from 0 to π, the compound waveform smoothly varies, from linelike (at φ = 0), to edgelike (at φ = π/2), and then back to linelike (via a different sequence of waveforms). This sequence of waveforms is then repeated as φ varies along the [π,2π) interval. Note that a waveform constructed with a particular value of φ is shifted by half a period (either in time or in space) when φ is replaced by φ + π, and thus does not produce new stimuli. In summary, by varying a single phase parameter on just half the circle, we create a “feature space” of one-dimensional (1D) equal-energy compound gratings. We call the corresponding parameter space the “phase circle,” keeping in mind that it comprises the periodic continuation of the [0,π) interval. In Figure 2b, this feature space is illustrated with the eight equally spaced samples around the phase circle that we used in these experiments.
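The properties of this feature space can be checked numerically. The following is a minimal sketch, assuming that the four components are cosines at harmonics 1, 3, 5, and 7 with amplitudes scaled as 1/(2m − 1) and a shared additive phase φ; the function names and the unit fundamental contrast are hypothetical choices for illustration.

```python
import math

HARMONICS = (1, 3, 5, 7)  # odd harmonics 2m - 1 for m = 1..4


def compound_grating(theta, phi, contrast=1.0):
    """Luminance modulation at cycle position theta (radians).

    Assumed form: sum over odd harmonics h of (contrast / h) * cos(h*theta + phi).
    """
    return sum(contrast / h * math.cos(h * theta + phi) for h in HARMONICS)


def sample_cycle(phi, n=512):
    """One full cycle of the waveform, sampled at n equally spaced points."""
    return [compound_grating(2 * math.pi * i / n, phi) for i in range(n)]


def energy(wave):
    """Mean squared modulation over one cycle."""
    return sum(v * v for v in wave) / len(wave)


# Eight equally spaced congruence phases on the half circle [0, pi)
PHASES = [j * math.pi / 8 for j in range(8)]

if __name__ == "__main__":
    for phi in PHASES:
        wave = sample_cycle(phi)
        print(f"phi = {phi:.3f}  energy = {energy(wave):.6f}  peak = {max(wave):.4f}")
```

With these assumptions, all eight waveforms have the same energy, the φ = 0 waveform is even-symmetric about the origin (linelike), the φ = π/2 waveform is odd-symmetric (edgelike), and replacing φ by φ + π reproduces the same waveform shifted by half a period.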
Note that although the edgelike combination of an infinite number of sine components is convergent (because it is the Fourier series of an edge; see Fig. 2a) the infinite series does not converge for any other phase congruence. Consequently, with the exception of the edgelike stimulus, the peak (Michelson) contrast of each compound waveform would grow without limit, albeit slowly, as additional odd-harmonic components were added. However, this does not lead to any practical difficulties, because we use only a finite set of gratings for all phase combinations. For a fixed set of components, the Michelson contrast in our feature space decreases monotonically (as a cosine function of congruence phase) from line to edge in either direction on the phase circle. The Michelson contrast is largest for the linelike waveform (congruence phase φ = 0), the contrast of which at peak is
Data analysis. Offline data analysis was performed in the Matlab programming environment using custom software. In general, fast Fourier transforms were used whenever Fourier analysis is mentioned. The details of the information analysis based on Fourier metrics have been given previously (Mechler et al., 1998b). Matlab toolbox functions, as well as custom programs, were used to perform tests of statistical significance. Specifics of each data analysis will accompany the description of the corresponding results.
RESULTS
Data were obtained from V1 neurons with parafoveal receptive fields (centered at 2–5° eccentricity). Following convention, we used the modulation ratio (see Materials and Methods) for the classification of V1 neurons: if the modulation ratio exceeded 1.0, neurons were classified as simple cells, and complex cells otherwise. A total of 226 data sets were collected from 137 neurons (88 complex and 49 simple) from 45 recording sites. Criteria for quantitative analysis were (1) good isolation was maintained throughout the experiments described below, and (2) responses to at least one of the compound gratings were reliable (d′ > 1.0 for the amplitude of any of the first six Fourier components of the response in comparison to the blank condition, or
Feature tuning in V1 neurons
Our aim in this study was to gain insight into how V1 neurons signal and discriminate spatial waveforms, including those that resemble salient spatial features such as edges and lines. These features are presumed salient because of spatial phase congruence. We know that although appropriate symmetry-selective filtering is necessary, linear filtering alone cannot explain the underlying feature-extraction mechanism. Subcortical visual processing involves nonlinear transformations, but these transformations are primarily related to adjustment of overall gain and dynamics, and are not orientation or feature specific. Thus, the neuronal circuitry that performs feature extraction in primates is almost certainly at a cortical level.
The neuronal implementation of feature extraction, however, is as yet unknown. Natural candidates for the prefilters are V1 simple cells the receptive field profiles of which have the appropriate even or odd symmetry as required by a local energy model. Although the analysis of phase selectivity to spatial compound gratings is a necessary step in understanding the relationship of these neurons to feature extraction, only a few studies of single neurons evaluated this directly: De Valois and Tootell (1983) and Levitt et al. (1990) in the cat, and Pollen et al. (1988) in the monkey. Our study extends these earlier works by examining responses to more complex (f + 3f + 5f + 7f) compound gratings at a closely spaced set of relative phases, and also responses to the components themselves. To obtain good statistical confidence, we typically recorded responses for 100 repeats of each stimulus. With tetrodes, we simultaneously probed multiple nearby neurons, thus examining the local variation of phase selectivity of V1 neurons. These measures allowed us to address questions about spatial feature extraction in V1 that have both neurophysiological and psychophysical implications.
The defining feature of simple cells is the simple, approximately linear fashion in which they appear to sum spatial stimuli within their classical receptive fields (Hubel and Wiesel, 1962), but it is well recognized that this approximate spatial linearity is typically compounded with various types of nonlinearity (Movshon et al., 1978a; Albrecht and Geisler, 1991; Carandini et al., 1997a). Strict linearity mandates that a response contain only components at those temporal frequencies that are present in the stimulus. If simple cells were strictly linear, the amplitude and phase of each harmonic component of their response to the compound grating would depend only on the corresponding component grating in the stimulus. The presence of other stimulus components, or the phase in which they are combined, should be irrelevant. Consequently, if we were to restrict the response measure to a single harmonic present in the stimulus, the magnitude and phase of this response harmonic would be identical for all of the compound gratings, up to a phase offset corresponding to the phase offset in the stimulus. Moreover, responses at even harmonics should be absent, because the stimulus components are restricted to the first four odd harmonics. However, nonlinearities are expected in the response to compound gratings even in simple cells. The most obvious nonlinearity in all V1 neurons is a spike threshold. Other nonlinearities expected in all V1 neurons include contrast gain control (Albrecht and Hamilton, 1982; Bonds, 1989; Heeger, 1992), which is thought to be phase-insensitive, and pattern adaptation (Maffei et al., 1973; Carandini et al., 1997b, 1998), which may be phase-sensitive. The aim of the initial analysis was to identify the effects of these nonlinearities in the responses of simple cells to compound gratings. We also asked whether nonlinear responses are tuned to spatial waveforms, and if so, how the tuning is distributed in the population of V1 simple cells.
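The linear prediction, and the way a threshold breaks it, can be illustrated with a toy model. The sketch below is not a model of any recorded neuron: it assumes an identity linear stage, a zero-threshold half-wave rectifier, unit fundamental contrast, and component amplitudes scaled as 1/(2m − 1); all function names are hypothetical.

```python
import cmath
import math

HARMONICS = (1, 3, 5, 7)  # odd harmonics present in the stimulus


def compound_cycle(phi, n=512):
    """One cycle of the assumed compound waveform with congruence phase phi."""
    return [sum(math.cos(h * 2 * math.pi * i / n + phi) / h for h in HARMONICS)
            for i in range(n)]


def amplitude(signal, k):
    """Amplitude of the k-th harmonic (mean magnitude for k = 0, peak otherwise)."""
    n = len(signal)
    coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(signal)) / n
    return abs(coeff) if k == 0 else 2 * abs(coeff)


def rectify(signal, threshold=0.0):
    """Half-wave rectification: output is zero below threshold."""
    return [max(0.0, v - threshold) for v in signal]


if __name__ == "__main__":
    for name, phi in (("line", 0.0), ("edge", math.pi / 2)):
        linear = compound_cycle(phi)
        spiking = rectify(linear)
        print(name,
              "linear F1/F2:", round(amplitude(linear, 1), 4), round(amplitude(linear, 2), 4),
              "rectified DC/F1/F2:", round(amplitude(spiking, 0), 4),
              round(amplitude(spiking, 1), 4), round(amplitude(spiking, 2), 4))
```

In this toy model the linear response has no even harmonics and the same F1 amplitude for every congruence phase, whereas rectification introduces a DC term and even harmonics whose size depends on the congruence phase. With a threshold of exactly zero, the odd harmonics are simply halved and remain independent of phase, which follows from the half-period antisymmetry of an odd-harmonic waveform.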
Responses of a paradigmatic simple cell are shown in Figure 3. This layer 4Cα simple cell had little spontaneous activity in the absence of visual stimuli (shown as the blank condition, i.e., a uniform screen of luminance set at the mean of the grating stimuli in Fig. 3a). Single drifting gratings (those in Fig. 3a, as well as other sine gratings used for characterizing the neuron; data not shown) elicited responses that seemed close approximations to half-wave rectified sinusoids the modulation frequency of which was that of the first harmonic component of the stimuli. This behavior is characteristic of typical simple cells, both in our data and as previously reported (Movshon et al., 1978a; Skottun et al., 1991). Responses elicited by the set of eight compound gratings are shown in Figure 3b, organized according to the position of the compound gratings in the feature space. This simple cell responded with a robust burst of spikes to the passage of an OFF-transient (luminance decrement), present to variable extent in each of the eight waveforms. Although the transient of the opposite polarity, an ON-transient (luminance increment), is also present in each stimulus waveform, this cell fired only minimally during its passage in most conditions. This sensitivity to spatial contrast polarity is characteristic of a linear spatial integrator followed by a threshold. Because of the threshold, an elevation in firing rate in a linear response to one polarity is not matched by a drop in firing rate to the opposite polarity.
Note the similarity between the response to the full edge (Fig. 3b, true edge) and the response to the stimulus that approximates an edge via its first four components (Fig. 3b, “edge”). For this cell, the response to the full edge is slightly narrower in time. This indicates that the passband of the linear receptive field of the cell was broad enough so that one or more stimulus components of the edge above the seventh harmonic affected the response of the cell. In most neurons, however, responses to the full edge and its truncated approximation were indistinguishable. Thus, the passbands of most neurons were sufficiently narrow so as to exclude the details present in those higher harmonics. This is expected given the average 2–2.5 octave spatial frequency bandwidth (full width at half-height) of macaque V1 neurons (De Valois et al., 1982).
The above observations were quantified by Fourier analysis. There is a more general reason for doing the Fourier analysis: we have no a priori knowledge of which response component carries feature dependent signals. Although nonlinear interactions may act to enhance selectivity toward a particular spatial feature, this need not be consistent across all response components. First, we consider conventional scalar response measures defined on Fourier amplitudes alone and in combination, the analysis of which is relatively straightforward. Next, we present an analysis of the Fourier amplitudes and phases jointly (as vectors in the complex plane), which is perhaps more demanding, but also more interesting, because the complex measures have larger signaling capacity attributable to the extra degree of freedom in the phases.
Feature tuning in scalar response measures
Figure 4a shows the analysis of Fourier amplitudes of the responses of the simple cell from Figure 3 to the sine gratings presented alone. Selective tuning to gratings of various spatial and temporal frequencies, drifting at a constant speed, is indicated by the response amplitudes measured at the fundamental frequency of each grating (amplitudes marked with thick bars). Note that the grating contrast was scaled as in the components of an edge: the contrast of the first component was three, five, and seven times larger than the contrast of the second, third, and fourth components, respectively. This means that the simple cell was even more sensitive to gratings of high frequencies than this plot indicates, i.e., the high-frequency cutoff in the passband of this cell fell beyond the seventh harmonic, because its response to this stimulus was unequivocal (m = 4 in Fig. 4a). Nonlinear responses to single gratings are indicated by nonzero components at multiples of the fundamental frequency for each grating. The approximately π/2 ratio of the response fundamental over the DC component of the response is consistent with these components originating from half-wave rectification. (An exact π/2 modulation ratio is expected for a perfect half-wave rectifier.)
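The π/2 value follows from a short calculation. As a worked sketch, assume an ideal zero-threshold half-wave rectifier driven by a sinusoid of amplitude A and period T (the symbols A, T, and ω = 2π/T are introduced here for illustration):

```latex
r(t) = \max\bigl(0,\; A \sin \omega t\bigr), \qquad \omega = \frac{2\pi}{T}

% Mean rate (DC): only the positive half-cycle contributes
F_0 = \frac{1}{T}\int_0^{T/2} A \sin \omega t \, dt = \frac{A}{\pi}

% Amplitude of the fundamental
F_1 = \frac{2}{T}\int_0^{T/2} A \sin^2 \omega t \, dt = \frac{A}{2}

% Modulation ratio
\frac{F_1}{F_0} = \frac{A/2}{A/\pi} = \frac{\pi}{2} \approx 1.57
```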
Nonlinearities are also seen in the response to the full edge (Fig. 4a, true edge). One manifestation of nonlinearity is the presence of responses at even harmonics, as described above. A second manifestation is that the responses measured at the odd harmonics to either of the compound gratings (Fig. 4b) or the full edge (Fig. 4a) are not equal to the responses to the corresponding gratings presented alone. For this cell, the individual grating responses would predict that the peak component of the response to each compound grating or the full edge occurs at the third harmonic frequency (F_{3}), but in fact it occurs at F_{1} or F_{2}. Although some Fourier components above the eighth harmonic temporal frequency (F_{8}) are still significant, the overwhelming part of the response energy is contained in the DC and the first eight components.
For this and other simple cells, examination of the Fourier amplitudes of the responses to compound gratings (Fig. 4b) reveals that F_{1} has both the largest response amplitude and the largest variation of amplitude across the stimulus set. At each frequency, linearity predicts identical Fourier amplitudes for all compound gratings. Note that although the approximate constancy of the DC component is consistent with the linear prediction in this simple cell (cell of Fig. 3), which thereby gives the DC component the poorest feature tuning, most other Fourier amplitudes show systematic variation (i.e., tuning) with stimulus congruence phase. Moreover, this tuning seems similar across components. Judging by the maximum amplitude of most components, the optimal waveform for this simple cell has a congruence phase π/2 ≤ φ_{opt} ≤ 3π/4 (between 90 and 135°). By any one of these response measures, therefore, this cell is tuned neither for edges nor lines but for an intermediate waveform.
In general, the nonlinear signature of complex cell responses to the compound gratings is that even-order Fourier harmonics dominate the response. In the typical complex cell, unlike the typical simple cell, the largest response component as well as the response component with the largest phase-dependent modulation is the DC or the second harmonic component, F_{2}. Figure 5 shows the responses of six more V1 neurons (mostly complex cells). As a group, these give a sense of the variety of phase-selective responses encountered in V1; individually, each is selected to emphasize a distinct point. Figure 5a shows the responses of a typical complex cell. For this cell, unlike for the typical simple cell, the poststimulus time histograms for drifting gratings, especially at high frequencies, are unmodulated. For compound gratings, the response histograms for this cell are characteristically bimodal, with a response transient corresponding to the passage of the stimulus transient of both contrast polarities. This contrasts with the unimodal histograms seen for the paradigmatic simple cell (Fig. 3). For each drifting waveform, there are two response peaks approximately half a period apart (in terms of the fundamental), but their size and ratio vary systematically with the congruence phase. Thus, the typical complex cell shows a strong nonlinearity (domination of the response energy by even-order harmonics), but the phase-dependent variation manifest in the size and ratio of the peaks diverges from what is expected of a phase-insensitive energy operator.
Figure 5, b and c, respectively, shows the responses of the complex and simple cell that had the highest gain and the least noisy responses in our sample. Both follow with high fidelity the higher harmonic modulations present in the stimulus. The simple cell responses exhibit a tendency of firing to be restricted to one-half of the stimulus period, indicative of dominant odd-harmonic Fourier components in the response. The response histograms of the complex cell exhibit the opposite tendency, toward a firing pattern that is replicated in each half of the stimulus period, indicative of dominant even-harmonic Fourier components in its response. However, these descriptions are caricatures, and most cells within our sample of >100 V1 neurons showed intermediate behavior. (The ability of the even and odd response harmonics to signal congruence phase is given in a systematic population analysis below.)
Each neuron discussed so far was typical in that it had a more or less vigorous response to each congruence phase, but with a variable response waveform. On the basis of the response histograms alone, therefore, it is difficult to tell by eye for most neurons whether they are selective to one or the other spatial waveform to any significant degree, and a quantitative analysis of the responses is necessary. However, a minority of the neurons were quite selective to certain waveforms to a degree that was obvious even from a cursory examination of their response histograms. Figure 5d–f presents examples of such phase-selective neurons. Figure 5d shows a complex cell that was broadly tuned to edges. Figure 5e shows another edge-selective complex cell that was quite responsive to the full edge but barely to the four-component edgelike compound grating. For this cell, most grating components probably fell below its passband, but it fulfilled the criteria for analysis based on d′ (see above). This behavior was rare (only 2 of 137 cells in our sample). The final example, a borderline simple/complex cell shown in Figure 5f, can be described as a (broadly tuned) line detector. This cell preferred an approximately linelike waveform (for the congruence phases tested, the largest peak of the response histogram occurs at φ_{opt} ≅ 7π/8). In general, only a few neurons in the entire sample of 77 V1 neurons that were analyzed exhibited such obvious phase preference.
Some V1 cells (such as the simple cell in Fig. 3) signal variation of congruence phase predominantly in their odd response harmonics, and other cells (such as most complex cells in Fig. 5) signal congruence phase predominantly in their even response harmonics. Therefore, scalar measures of the even and odd response energy are also obvious candidates for further analysis. For the simple cell of Figures 3 and 4, some of these measures are examined in Figure 6.
The four response measures shown here are the mean firing rate (Fig. 6 a, DC), the even-harmonic energy (defined as the summed squared amplitudes of the DC and harmonics 2, 4, 6, and 8) (Fig. 6 b), the odd-harmonic energy (summed squared amplitudes of harmonics 1, 3, 5, and 7) (Fig. 6 c), and the total response energy (summed squared amplitudes of the DC and the first eight harmonic components of the response) (Fig. 6 d). The linear prediction that the response is independent of congruence phase fails. Each of these response measures systematically depends on the stimulus phase, and, for the three energy measures, this dependence is substantial.
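The four scalar measures defined above can be sketched numerically. The following is a minimal illustration, not the analysis code used in the study: it assumes the response is available as a cycle histogram (`psth`, a hypothetical name) sampled over exactly one stimulus period, and it takes the amplitude of harmonic k to be twice the modulus of the corresponding discrete Fourier coefficient.

```python
import numpy as np

def scalar_response_measures(psth, n_harmonics=8):
    """Return DC, even-harmonic, odd-harmonic and total response energy.

    psth: firing rate (impulses/s) sampled evenly over one stimulus period.
    Assumption: amplitude of harmonic k is 2*|c_k|, where c_k is the
    normalized discrete Fourier coefficient of the histogram.
    """
    psth = np.asarray(psth, dtype=float)
    n = psth.size
    if n < 2 * n_harmonics + 2:
        raise ValueError("histogram too coarse for the requested harmonics")
    spectrum = np.fft.rfft(psth) / n
    dc = spectrum[0].real                                # mean firing rate
    amps = 2.0 * np.abs(spectrum[1:n_harmonics + 1])     # harmonics 1..8
    even = dc**2 + np.sum(amps[1::2] ** 2)               # DC, 2, 4, 6, 8
    odd = np.sum(amps[0::2] ** 2)                        # 1, 3, 5, 7
    return {"dc": dc, "even": even, "odd": odd, "total": even + odd}

# Synthetic example: DC of 10, first harmonic of amplitude 4, second of 2.
theta = 2 * np.pi * np.arange(64) / 64
measures = scalar_response_measures(10 + 4 * np.cos(theta) + 2 * np.cos(2 * theta))
```

Evaluating these measures once per congruence phase gives the tuning curves that the text describes.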
To describe the dependence of each of these response measures on spatial phase, we used the method of least squares to fit a harmonic function of the congruence phase, φ, to the response measure, R:
In the circular feature space used here, the sharpness of the tuning to features of a response measure (i.e., its feature selectivity) is naturally measured by the circular variance (CV) of the response measure (Mardia, 1972). The CV is defined as:
For the simple cell in Figure 6, the four response measures, although not equally sharply tuned, yield very similar optimal phases (arrows). This is remarkable because one might expect that they reflect the effects of different nonlinearities. For this cell, the optimal compound waveforms had a congruence phase φ_{opt} ≈ 2π/3 (120°). The DC was least tuned to congruence phase (any tuning in the DC is attributable to nonlinearities of at least fourth order; see Appendix), and the three energy response measures were about equally selective when measured by circular variance (1 − CV was 0.03 for DC, ∼0.18 for each energy measure).
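The tuning fit and the selectivity index can be illustrated with a short sketch. The functional forms are assumptions, not the equations of the original text: the fit is taken to be a single harmonic with the period of the feature circle, R(φ) ≈ r0 + r1 cos(2(φ − φ_opt)), and the circular variance is taken in its usual response-weighted form on doubled angles. The period of π reflects the eight tested congruence phases spanning 0 to 7π/8.

```python
import numpy as np

def fit_harmonic(phases, responses, period=np.pi):
    """Least-squares fit of r0 + r1*cos(w*(phi - phi_opt)), w = 2*pi/period."""
    phases = np.asarray(phases, dtype=float)
    responses = np.asarray(responses, dtype=float)
    w = 2 * np.pi / period
    design = np.column_stack(
        [np.ones_like(phases), np.cos(w * phases), np.sin(w * phases)]
    )
    (r0, a, b), *_ = np.linalg.lstsq(design, responses, rcond=None)
    r1 = np.hypot(a, b)                         # modulation depth
    phi_opt = (np.arctan2(b, a) / w) % period   # phase of the tuning peak
    return r0, r1, phi_opt

def circular_variance(phases, responses, period=np.pi):
    """Assumed form: CV = 1 - |sum R_k exp(i*w*phi_k)| / sum R_k."""
    phases = np.asarray(phases, dtype=float)
    responses = np.asarray(responses, dtype=float)
    w = 2 * np.pi / period
    resultant = np.sum(responses * np.exp(1j * w * phases))
    return 1.0 - np.abs(resultant) / np.sum(responses)

# Synthetic tuning curve peaking at 2*pi/3, sampled at eight congruence phases.
phi = np.arange(8) * np.pi / 8
rate = 20 + 5 * np.cos(2 * (phi - 2 * np.pi / 3))
r0, r1, phi_opt = fit_harmonic(phi, rate)
selectivity = 1.0 - circular_variance(phi, rate)
```

Under these assumptions, a purely sinusoidal tuning curve has a selectivity index 1 − CV equal to r1/(2 r0), so the index grows with the depth of modulation relative to the mean.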
The analysis shown for the simple cell in Figure 6 was also carried out for the examples of Figure 5 (mostly complex). Figure 7 summarizes quite similar results for the DC and the three energy measures. The DC (open circles) usually predicted the same optimal congruence phase, but in most cases was a less selective measure than the energy measures, as quantified by the CV. Although a greater selectivity is expected for the energy measures than for the DC merely because the energy (impulses squared/seconds squared) but not the DC (impulses/second) is a squared quantity, the full extent of the observed selectivity difference is not explained by units of measurement. In the case of the typical complex cell in Figure 7 a, the even energy (squares) and odd energy (triangles) are similarly tuned, but the even energy dominates. The dominance of the response by even energy is even more pronounced in the case of the complex cell in Figure 7 b. In this case, and in the case of the “edge detector” (Fig. 7 d), the even and odd energy are also differently tuned. (Note that although the odd energy is very small, the measured values are highly reliable, as determined by the illustrated bootstrap confidence limits.) However, in most cases when the even and odd response energies were both substantial, such as in the cases of the simple cell (Fig. 7 c) and the line detector (Fig. 7 f), the two scalar measures tended to be similarly tuned. Note that Figure 7, b and c, shows the cells with the highest signal-to-noise ratios in our sample of V1 neurons; the error bars of the other cells are more typical.
Figure 8 shows an example of how phase tuning varies locally in V1. These four complex cells, recorded simultaneously by a tetrode, exhibit considerable differences in phase sensitivity (gain), selectivity, and preference. This is representative of the variation of these parameters in local V1 ensembles. Cell 1, the cell with the highest gain in this local cluster, and cell 2 are least selective: their tuning curves (Fig. 8 c, left) approximate what would be expected from a strict (phase-insensitive) energy calculation. In comparison, cell 3 (the least sensitive in this cluster) and cell 4 (comparable in sensitivity to cell 2) are both well tuned but tuned to different preferred phases (Fig. 8 c, right). Cell 3 is tuned to a waveform the congruence phase of which is intermediate between that of a line and an edge. (Judged from its responses shown in Fig. 8 a, cell 3 seems simple, but it was classified as a complex cell on the basis of its response to the optimal single grating.) Cell 4 is tuned to a linelike waveform.
Another notable point is that responses of cell 4 to compound gratings have a single mode (Fig. 8 b, innermost histograms), much like those of simple cells, but its responses to single sine gratings, except at the lowest spatial frequencies (Fig. 8 a, histograms in rightmost column), consist mostly of spike rate elevation and only weak modulation, the defining characteristic of complex cells. Such apparently mixed behavior was observed in many cells of both classes (as defined by their responses to single gratings) in our sample: simple cells could have strong even harmonic components in response to compound gratings (as in Fig. 7 c), whereas complex cells could have strong odd harmonics in response to compound gratings. Mixed behavior, i.e., behavior intermediate between what is expected for an “ideal” simple and an “ideal” complex cell, was reported earlier in cat area 17 neurons studied with contrast-reversed single gratings (Spitzer and Hochstein, 1985). However, the mixed behavior observed by those authors was based on absolute phase (position) sensitivity, not on the sensitivity to relative phase (or feature) as observed in this study.
Feature tuning in vector response measures
The energy measures considered above are sensitive to response size but not timing. The extra degree of freedom present in the response phases may also make it possible for the responses to encode the stimulus space (a circle), which is of genuinely two-dimensional (2D) topology and which the scalar measures are incapable of encoding. To determine whether this is indeed the case, we next consider a joint analysis of the amplitude and phase of response components. We begin this analysis on the simple cell of Figures 3, 4, and 6. Figure 9 a shows the dominant response component, F_{1}, plotted as a vector on the complex plane for each of the eight compound gratings. F_{1} is referenced to the phase of the fundamental stimulus component by subtracting the congruence phase φ (Eq. 2) from the measured phase of F_{1}. (This plotting convention corresponds to h = 1 in the Appendix.) With this phase reference, a linear response would be represented by the same complex number for each stimulus: the eight plotted responses would all coincide at a single point. The expected position of the linear response is the center of the dark disk (m = 1 alone) in Figure 9 a, which represents the response to the fundamental grating component presented alone. Deviation from this, as indicated by the lawful arrangement of responses on a loop, indicates the effects of phase-sensitive nonlinear interactions between the different harmonic components of the stimulus. Because our stimuli, by design, contained only odd harmonics of the fundamental frequency, nonlinear contributions at the fundamental can be attributed only to odd-order nonlinearities. (For details on how our stimulus design determines the frequency and phase signature of nonlinearities, see the Appendix.) Third-order interactions, the odd-order nonlinearities with the lowest order, are likely the largest contributors to F_{1}.
As detailed in the Appendix, third-order nonlinearities are of two kinds, with different implications for how their phase dependence affects the shape of the locus plotted in Figure 9.
To get a better view of the details of the F_{1} responses in Figure 9 a, we present an expanded version in Figure 10. One kind of third-order nonlinearity that can contribute to F_{1} is represented by the combination F_{1} + F_{k} − F_{k} (see n = 3; p = 1 in the Appendix and Table A1). The phase of this nonlinear contribution covaries with that of the fundamental because the phases of F_{k} and −F_{k} in the stimulus cancel each other. For these interactions, the convention used for plotting phases in Figure 9, namely, offsetting by the phase of the fundamental grating component, will lead to a plotted response vector that is independent of congruence phase. (This is because the congruence phase φ is identical to the phase of the fundamental grating.) That is, these components can contribute to a difference between the average response to the compound gratings and the response to F_{1} alone, but they cannot contribute to differences among the responses to the eight compound grating stimuli. Their contribution is represented graphically in Figure 10 as the displacement between the center of the ellipse (blue star) and the response to F_{1} alone (red disk).
The other kind of third-order nonlinear interaction that leads to responses at the fundamental frequency consists of contributions such as F_{3} − F_{1} − F_{1} and F_{5} − F_{3} − F_{1} (n = 3; p = −1 in the Appendix and Table A1). The raw phase of these responses varies as −φ, not φ. Thus, after subtraction of the phase of the fundamental (i.e., the congruence phase φ), their contribution rotates as −2φ. Each of these third-order nonlinearities, if present in isolation, would therefore lead to a circular locus for the plot of F_{1}. When combined with arbitrary phases and strengths, their aggregate can thus lead to an elliptical locus for F_{1}. However, because each contribution rotates as −2φ, their aggregate cannot shift the center of the response locus. Thus, the third-order interaction of the first type, together with the linear part, determines the center of the response locus.
The data in Figure 10 approximate an ellipse rather than a circle. We show in the Appendix that the fifth-order nonlinearities further displace the center of the locus (p = 1 terms), add elliptical distortions to the circle (p = −1 and p = 3 terms), and also add asymmetric distortions (p = −3 terms). Higher-order nonlinearities add even more distortions to the elliptic configuration. In summary, the approximately elliptical locus seen in Figure 10 represents the combined effect of third- and higher-order nonlinearities.
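The phase bookkeeping in the preceding paragraphs can be written compactly. As a sketch under stated assumptions: each stimulus component enters an interaction with phase +φ or −φ, p is the net multiple of φ carried by an interaction term (the p of the text), and C_p is a hypothetical complex constant that lumps together all interactions sharing the same p. Referencing F_{1} to the fundamental multiplies every term by e^{−iφ}, so the plotted response is

```latex
\tilde{F}_1(\varphi) \;=\; \sum_{p} C_p \, e^{i(p-1)\varphi}
  \;=\; \underbrace{C_{1}}_{\text{center}}
  \;+\; \underbrace{C_{-1}\, e^{-2i\varphi}}_{\text{rotates as } -2\varphi}
  \;+\; \underbrace{C_{3}\, e^{+2i\varphi}}_{\text{counter-rotates}}
  \;+\; \underbrace{C_{-3}\, e^{-4i\varphi}}_{\text{asymmetric distortion}}
  \;+\; \dots
```

In this notation, C_{1} collects the linear response and all p = 1 terms and does not depend on φ, which is why it can only set the center of the locus. The p = −1 and p = 3 terms rotate in opposite directions at the same rate and together trace an ellipse around that center, and the p = −3 term completes two turns per cycle of the feature circle and therefore distorts the ellipse asymmetrically.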
The net result of these nonlinearities is that the locus of the F_{1} response depends strongly on the stimulus profile. The strong modulation of the amplitude is comparable to what was seen for the odd-harmonic energy (Fig. 6 c), which was dominated by the contribution of F_{1}. As we show in the Appendix, an elliptic approximation of the configuration of the Fourier harmonics of the response in the complex plane captures the contributions from nonlinearities up to a certain order (order 4 for F_{1}). Thus, for descriptive purposes, we fit an ellipse to the set of eight data points, forcing an equal phase separation of corresponding points on the ellipse (Fig. 10, white dots on the blue ellipse, indicated by blue arrowheads):
Although the fitted ellipses may deviate detectably from the data because of the presence of high-order nonlinearities, they provide a useful summary of the response to the eight stimuli. Both the statistical significance and the usefulness of this summary depend on the response variance, but in a different manner. The total variance in the data (i.e., the variance of the responses to each trial of each congruence phase) consists of two parts: first, the scatter caused by trial-by-trial variation of the responses to each congruence phase, and second, the dependence of the mean response on congruence phase. Trial-by-trial variation of the Fourier components is measured by the 95% confidence regions estimated by the T
For the example cell shown in Figure 10, the fitted ellipse runs through the 95% confidence regions of each response (gray disks). However, when the eight data points (black crossed centers of gray disks) are considered together, their overall deviation from the fit (white dots on the ellipse, indicated by blue arrowheads) is beyond the range expected from measurement error (χ
The most impressive feature of this plot, and one that is typical of the other data sets, is that the responses capture the topology of the stimulus space. That is, the orderly progression of F_{1} in a single loop around the ellipse mimics the progression of the congruence phase around the phase circle. One can also see that the response fundamental of this cell measured for the true edge (blue circle) and its truncated approximation with only four components are indistinguishable (their error circles fully overlap in Fig. 10). Because none of the error circles corresponding to the eight compound gratings overlap with one another or with the origin, this cell can detect and identify each waveform based just on the response fundamental, provided that F_{1} amplitude and phase are jointly considered. Amplitude alone primarily distinguishes waveforms the congruence phase of which is in the range 0 ≤ φ_{opt} ≤ π/2 (top half of the ellipse) from those waveforms with a congruence phase that is in the range π/2 ≤ φ_{opt} ≤ π (bottom half of the ellipse). The phase of F_{1} primarily discriminates along the orthogonal direction within the stimulus space, i.e., between the edgelike and linelike waveforms.
To compare response characteristics across neurons, we define the optimal congruence phase to be the phase parameter on the ellipse that corresponds to the most distant point from the origin. That is, it is the congruence phase of the stimulus that leads to the largest F_{1} response, as interpolated by the fitted ellipse. With this definition, the optimal congruence phase for this cell is φ_{opt} = 0.68π rad (≡ 122°), a stimulus that is intermediate between edge and line (see snapshots in Fig. 10). This corresponds closely to the optimal stimulus as inferred from the scalar measures, shown in Figure 6.
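The ellipse fit and the resulting definition of the optimal congruence phase can be sketched as follows. The parametrization is an assumption rather than the fitted equation of the original text: the locus is modeled as z(φ) = c + a e^{−2iφ} + b e^{2iφ}, which traces an ellipse centered at c and places the eight fitted points at equally spaced parameter values. The names `fit_ellipse` and `optimal_phase` are illustrative.

```python
import numpy as np

def fit_ellipse(phases, responses):
    """Complex least-squares fit of z = c + a*exp(-2j*phi) + b*exp(2j*phi).

    responses: phase-referenced Fourier components (complex), one per
    congruence phase. Returns the coefficients (c, a, b).
    """
    phases = np.asarray(phases, dtype=float)
    z = np.asarray(responses, dtype=complex)
    design = np.column_stack(
        [np.ones_like(phases, dtype=complex),
         np.exp(-2j * phases),
         np.exp(2j * phases)]
    )
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coeffs

def optimal_phase(coeffs, n_grid=3600):
    """Congruence phase whose point on the fitted ellipse is farthest from 0."""
    c, a, b = coeffs
    grid = np.linspace(0.0, np.pi, n_grid, endpoint=False)
    locus = c + a * np.exp(-2j * grid) + b * np.exp(2j * grid)
    return grid[np.argmax(np.abs(locus))]

# Synthetic locus built so that all three terms align at phi0 = 2*pi/3.
phi0 = 2 * np.pi / 3
phi = np.arange(8) * np.pi / 8
z = 2.0 + np.exp(2j * phi0) * np.exp(-2j * phi) + 0.3 * np.exp(-2j * phi0) * np.exp(2j * phi)
phi_opt = optimal_phase(fit_ellipse(phi, z))
```

Because the model is linear in c, a, and b, the fit needs no iterative optimization, and the grid search interpolates the optimum between the eight tested phases.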
Figure 9, b and d, shows that the higher response harmonics F_{2} and F_{4} also depend on the congruence phase of the stimulus in a similar way as F_{1}, but F_{3} (Fig. 9 c) behaves differently. Its dependence on stimulus phase is much less prominent (the mean responses are indistinguishable by spatial phase, as the overlapping error circles indicate) than for F_{1}, F_{2}, and F_{4}. Moreover, the F_{3} response to every compound grating is less than the linear prediction, i.e., the F_{3} response to the second component grating, which contains this frequency alone (red). (The amplitude of this response was also shown in Fig. 4 a.) That is, the spatial nonlinearities contributing to F_{3} that are elicited by the compound gratings are all antagonistic to the linear contribution to the F_{3} response. In sum, although nonlinear interactions may act to enhance selectivity toward a particular spatial feature, this need not be consistent across all response components. Nevertheless, in this and most other cells, most of the significant Fourier harmonics of the response tended to be maximal for the same stimulus waveform.
The results of a similar analysis of response harmonics in the complex plane are summarized in Figure 11 for the six V1 neurons shown in Figures 5 and 7. For each cell, one or more of the representative Fourier components are plotted with the conventions of Figure 9. The remarkably different levels of signal-to-noise ratio among these cells are evident from the very different sizes of the error circles in these plots (for clarity, some of the error circles were omitted). As explained above for the simple cell of Figure 10, the goodness of the elliptical fit is typically greater for neurons that have responses of lower signal-to-noise ratio. In general, the harmonics at which the response was largest were also the most phase selective. For example, the high-fidelity complex cell (Fig. 11 b) is well tuned in F_{2} and F_{4} but poorly in F_{1}. Conversely, the “line detector” (Fig. 11 f) is remarkably tuned in the odd harmonics F_{1} and F_{3}, which are also the largest. The locus on the complex plane of the odd harmonic responses of this phase-selective neuron is well approximated by an elongated ellipse the long axis of which is aligned with the direction from the origin. This alignment, together with the narrow eccentricity, maximizes phase sensitivity and selectivity. However, this neuron is less well tuned in the even harmonics: F_{2} (data not shown to prevent overlap with F_{1}), the largest even harmonic, is comparable in amplitude to F_{3}, but its elliptic locus is less eccentric and tilted at an angle relative to the direction of the origin. A comparison of these results with those of Figure 7 reveals that the optimal congruence phase predicted by the ellipses fitted to the various Fourier components of the responses corresponds very well with the optimal phase values deduced from the scalar response measures.
Feature tuning in the V1 population
Figure 12 a summarizes the comparison of the optimal congruence phase obtained from scalar response measures (as in Fig. 6) and vector response measures (as in Fig. 10) for the population of simple cells. Figure 12 b is a similar population summary for the complex cells. These are summaries of the 94 data sets (36 simple, 58 complex) that passed both the original d′ criterion for analysis and the additional criterion on trial-by-trial variance for the complex plane analysis.
For each response measure (indicated by labels above the top row of graphs), the wedge diagrams on the diagonal show the distribution of the optimal congruence phase. The area (not the radius) of each wedge is proportional to the frequency of cells the optimal congruence phase of which fell into the corresponding range of phases. The optimal phase of each cell is indicated by a dot at the corresponding direction on the perimeter. The wedge diagram indicates gross deviations from the uniform distribution on the circle; its details are sensitive to binning. The Rayleigh test (Mardia, 1972) quantifies deviation from uniformity toward a unimodal distribution. By the Rayleigh test (performed on the optimal congruence phases before binning for the wedge diagrams), the null hypothesis of uniform distribution on the circle is rejected if the sample mean is significant (*p < 0.05; **p < 0.01; ***p < 0.005). For both simple and complex cells, some response measures had a small but significant population bias in the optimal phase toward the congruence phase of the line (φ_{opt} ≈ 0). Arrows indicate significant population biases, and surrounding wedges indicate the 95% confidence intervals on the direction of bias, as estimated from circular standard error (Fisher, 1993). In simple cells, a significant bias is found only in the second response harmonic (F_{2}, h = 2, one of the two possible forms of analysis) and in the first response harmonic (F_{1}, h = 1), where the distribution is apparently bimodal (this was not tested). In complex cells, the bias is significant in all vector measures examined and also in the odd energy. Significant bias was also found in the same four response measures when all 94 data sets (simple and complex) were analyzed together. In all cases where there was evidence for deviation from uniformity toward a unimodal circular distribution, the 95% confidence limits around the bias angle included the line phase.
The population bias for simple and complex cell populations was statistically indistinguishable for all measures except the odd energy and F_{1} (p < 0.05; two-sample circular mean test) (Fisher, 1993).
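For readers who wish to reproduce this kind of test, the Rayleigh statistic can be sketched as below. This is an illustration, not the authors' implementation: it assumes the feature circle has period π (so phases are doubled before testing), and it uses a common closed-form approximation for the p value, which may differ slightly from tabulated values. The function name and the example phases are hypothetical.

```python
import numpy as np

def rayleigh_test(phases, period=np.pi):
    """Rayleigh test for a unimodal departure from circular uniformity.

    Returns (z, p, mean_direction). Assumption: p is computed with a
    widely used closed-form approximation, not exact tables.
    """
    angles = 2 * np.pi * np.asarray(phases, dtype=float) / period
    n = angles.size
    vector_sum = np.sum(np.exp(1j * angles))
    resultant = np.abs(vector_sum)            # R = n * mean resultant length
    z = resultant**2 / n                      # Rayleigh statistic
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n**2 - resultant**2)) - (1 + 2 * n))
    mean_direction = (np.angle(vector_sum) * period / (2 * np.pi)) % period
    return z, min(p, 1.0), mean_direction

# Hypothetical optimal phases clustered near the line phase (phi close to 0).
clustered = [0.05, 0.10, 0.00, 0.02, 0.08, 0.03, 0.07, 0.04, 0.06, 0.01]
z, p, bias_direction = rayleigh_test(clustered)
```

A small p indicates a population bias, and `bias_direction` gives its direction on the feature circle.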
The various response measures are not exactly equivalent measures of phase tuning. Comparisons of optimal congruence phases for pairs of response measures show a range of correlation, as seen in the scattergrams above the top diagonal of Figure 12. In each scattergram, each data point corresponds to a single cell and compares the optimal phase angles for a pair of response measures, with the horizontal axis of the scattergram corresponding to the row measure (indicated by the row label of the scattergram) and the vertical axis of the scattergram corresponding to the column measure (indicated by the column label of the scattergram). Note that both axes refer to periodic variables. If two response measures predicted identical optimal phases, then all points would fall on the diagonal line of unity slope with zero phase difference between the two predictions. If the optimal phases predicted by two response measures are fully uncorrelated, then the data would be evenly dispersed within a stripe of π width centered on the diagonal line of unity slope. To make these observations more precise, for each pair of measures of optimal congruence phase, we tested for the presence of a linear relationship between them: φ_{1, opt} = φ_{2, opt} + φ_{diff}. We defined the circular correlation coefficient by a normalized vector quantity calculated from the circular covariance (Fisher, 1993) of the two sets of congruence-phase values. The circular covariance is a complex number of magnitude ≤1. Its modulus r_{c} is analogous to the absolute value of a linear correlation coefficient and indicates the strength of correlation. Its angle is the mean difference φ_{diff} between the two sets of congruence phases, as estimated from circular regression, and not the algebraic mean.
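The properties described above (a complex quantity of modulus at most 1, whose modulus measures association and whose angle gives the mean phase difference) can be illustrated with a simple stand-in. The sketch below is not the estimator of Fisher (1993) used in the text: it is the mean resultant vector of the phase differences, computed on doubled angles under the assumption that the feature circle has period π. The function name is hypothetical.

```python
import numpy as np

def circular_association(phi1, phi2, period=np.pi):
    """Mean resultant vector of the differences phi1 - phi2 (a stand-in).

    Returns (r_c, phi_diff): r_c lies in [0, 1]; phi_diff is the circular
    mean of the differences, in (-period/2, period/2].
    """
    w = 2 * np.pi / period
    diff = w * (np.asarray(phi1, dtype=float) - np.asarray(phi2, dtype=float))
    vector = np.mean(np.exp(1j * diff))
    return np.abs(vector), np.angle(vector) / w

# Two hypothetical measures that agree up to a constant offset of 0.2 rad.
phi_b = np.linspace(0.0, np.pi, 20, endpoint=False)
phi_a = (phi_b + 0.2) % np.pi
r_c, phi_diff = circular_association(phi_a, phi_b)
```

Because the mean is taken over unit vectors rather than over the raw differences, wrap-around at the ends of the phase range does not bias the estimate, which is the sense in which a circular mean differs from the algebraic mean.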
As indicated by the values of r_{c} (Fig. 12, top left corner of each scattergram), some pairs of measures were highly correlated (*p < 0.05; **p < 0.01; ***p < 0.005). One such pair is total power versus total even power for both cell classes (Fig. 12 a, b). Other pairs of measures were less tightly correlated, and the null hypothesis of random association on the circle could not be rejected (p > 0.05), e.g., for both cell classes, the F_{2} measures versus the odd response power or the DC in Figure 12, a and b. In simple cells but not complex cells, measures based on F_{1} and odd response energy were significantly correlated with those based on DC, F_{2}, and the even response energy (Fig. 12 a). Only in simple cells was the F_{1} component comparable to the even harmonics, or odd response energy comparable to the even response energy, so it is not surprising that tuning measures based on these quantities behaved in a more random fashion in complex cells.
In simple cells, the sensitivity (as defined by the gain) of the response measures was a good predictor of their feature selectivity (as quantified by the selectivity index, 1 − CV). Note that in simple cells, the even harmonics were often of comparable sensitivity as well as selectivity to the odd harmonics [median (1 − CV) of the odd versus even response energy: 0.185 versus 0.183; p > 0.2; N = 36; Wilcoxon's paired-sample signed-rank test of medians]. However, in complex cells, the even harmonic response measures were typically the more sensitive, but the odd harmonics tended to be the more selective to relative phase [median (1 − CV) of the odd versus even response energy: 0.24 versus 0.12; p < 10^{−5}; N = 58; Wilcoxon's paired-sample signed-rank test of medians]. The odd harmonics were even more selective in complex cells than in simple cells (p = 0.002; N = 58 + 36; Kolmogorov–Smirnov two-sample test). These observations are not as counterintuitive as they might seem, given the nature of the stimuli. Only phase-sensitive nonlinearities can contribute to selectivity, because all stimuli have the same components and the same overall power. In simple cells, the odd harmonics carry large responses, but their main contributors are likely to be linear, which will dilute any feature selectivity of the nonlinear contribution at the odd harmonics. Conversely, a contrast-polarity-invariant nonlinearity, well known from the “On-Off” transient responses of complex cells, is likely to produce a large but phase-insensitive response at the even harmonics, which dilutes any stimulus selectivity that other nonlinear contributions might confer.
Strong correlation between two circular quantities does not preclude systematic phase differences between them. A systematic phase difference obtained on the population would indicate that two response measures have a relative bias in estimating the optimal feature, which in turn would mandate caution in using one or both measures for this purpose. To examine whether optimal congruence phases predicted by different response measures had such a difference, we plotted the distribution of phase difference in histograms shown in the bottom diagonal matrix for each pair of response measures. The mean phase difference φ_{diff} estimated from circular regression, together with 95% confidence limits estimated from bootstrap, is indicated by line intervals below the histograms. With one exception, φ_{diff} was not significantly different from 0. The sole exception was a small difference seen in the data for simple cells for the moderately correlated pair of the second Fourier harmonic and the total response energy (Fig. 12 a; F_{2}, h = 2 vs ALL in bottom row). Because this is an a posteriori finding (1 among the 21 possible phase differences that were evaluated), it is not likely to be physiologically meaningful.
In summary, response measures that were robust in general were quite sensitive to congruence phase, well correlated with one another, and tuned to very similar optimal phases. The caveat is that the robustness and sensitivity of a response measure to relative phase depend on the extent and ratio of phase-sensitive nonlinear contributions to the odd and even Fourier harmonics, which can vary from cell to cell.
For all measures, the optimal phase is rather evenly distributed in V1, which indicates that V1 neurons can signal the symmetry of spatial waveforms with little bias. We do see a modest but significant population bias in the optimal congruence phase predicted by many response measures toward what corresponds to the linelike (even-symmetric) waveform. This raises the possibility that there is a small but significant relative abundance of cells sensitive to lines (but not to edges or odd-symmetric spatial waveforms) among the V1 neurons. An alternative explanation could lie in the unequal Michelson (peak) contrast of our compound gratings: smallest for the edgelike, largest for the linelike waveform (although all were equal in contrast energy). In this scenario, all other factors being equal, the apparent abundance of neurons would correspond to the peak contrast and, in turn, to the relative efficacy of the stimulus for which they are selective. However, the argument that Michelson contrast and not contrast energy is the effective stimulus for V1 neurons does not seem compelling, because it would predict both that line-selective neurons are apparently most abundant and that edge-selective neurons are apparently least abundant. This is not supported by the wedge diagrams in Figure 12. One possibility that is consistent with our data is that there is a small relative abundance among V1 neurons for both edge and line preference over all others (i.e., an underlying moderately bimodal circular distribution centered on the odd- and even-symmetric waveforms), as might be anticipated from the psychophysics of relative phase sensitivity and discrimination, such as the results of Burr (1980). Then the unequal Michelson contrasts might magnify the line preference and reduce the edge preference, leading to distributions of apparent congruence phase preference similar to those that we found. However, it is important to note that the deviation from a uniform distribution is, at most, moderate.
Discrimination of spatial features in V1 neurons
The analysis so far has focused on what spatial features drive V1 neurons optimally and addressed this question via various traditional response measures. This approach demonstrated that there is a certain degree of arbitrariness in choosing a response measure and that the optimal congruence phase may depend on this choice. Similar problems arise when the traditional response measures are used to address a related question: how accurately can V1 cells report the differences in features, i.e., how well can an observer of the spike responses of these neurons discriminate the waveforms of pairs of compound gratings? Discrimination of features depends on response selectivity (i.e., how sharply the response depends on the stimulus parameter of interest), not response size. For example, the response of an orientation-selective neuron is maximized by a stimulus of optimal orientation (tuning peak), but the orientation discrimination sensitivity of the neuron is maximal at the steepest slopes on the rising and falling edges of the orientation-tuning curve (Vogels and Orban, 1990). Additionally, discrimination depends on the intrinsic variability of the response, measured as trial-by-trial variability when the stimulus parameter of interest is held constant. In sum, it is not obvious what response measures (traditional or otherwise) are most useful for discrimination. These problems are largely circumvented by information-theoretic analysis that is applied directly to the entire spike responses. This permits a more rigorous answer to questions about feature tuning and phase discrimination.
Shift-reduced Fourier metric
These preliminaries motivated our analysis of neuronal phase discrimination in V1. We used a metric space method (Victor and Purpura, 1996, 1997) to estimate information from the entire spike response rather than just a single extracted variable such as the spike count or any Fourier component. Furthermore, because the stimuli were periodic, we used the variant of the metric space analysis (Mechler et al., 1998b) based on Fourier harmonics of the response. We applied this approach to examine discriminability of each pair of stimuli. The method consists of three stages: first, calculation of dissimilarity measures; second, spike train clustering; third, calculation of transinformation. The dissimilarity of two spike trains is measured by the Euclidean distance between the two vectors composed of n selected Fourier components of each spike train. These vectors are of dimension 2n, because each Fourier component has both a real and an imaginary part. The clustering or classification of each spike train response consists of labeling it with the stimulus that elicited the set of responses that, on average, is closest to it. If the responses to different stimuli are reliable and distinctive, then every spike train will be correctly classified as to the stimulus that elicited it, but if the responses to different stimuli are intermingled and difficult to distinguish, many spike trains will be assigned to the wrong stimulus. For the joint analysis of a set of m stimuli, the correct and incorrect tallies are summarized by an m by m confusion matrix. For a pair of stimuli, the confusion matrix is a 2 × 2 table. The transinformation is computed from the confusion matrix in the standard way (Cover and Thomas, 1991). We corrected the information estimate for the small-sample bias by subtracting the average result of a repeated analysis that used shuffled data sets that were constructed by random reassignment of spike trains to stimuli (Victor and Purpura, 1997).
This bias was usually very small (<0.02 bits), because there were only two stimulus categories and 75–100 repetitions of each (Treves and Panzeri, 1995).
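The three stages described above (dissimilarity, clustering, transinformation) and the shuffle correction can be sketched as follows. This is a simplified illustration and not the published implementation: spike times are assumed to be expressed as phases within the stimulus cycle, "closest on average" is taken as the plain arithmetic mean of distances with the classified train excluded from its own class, and each stimulus is assumed to have at least two trials. All function names are hypothetical.

```python
import numpy as np

def fourier_vector(spike_phases, harmonics):
    """Stack real and imaginary parts of n Fourier components (dimension 2n)."""
    ph = np.asarray(spike_phases, dtype=float)
    comps = np.array([np.sum(np.exp(-1j * k * ph)) for k in harmonics])
    return np.concatenate([comps.real, comps.imag])

def confusion_matrix(vectors, labels, n_stimuli):
    """Assign each response to the stimulus whose responses are closest on average."""
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    conf = np.zeros((n_stimuli, n_stimuli))
    for i, (v, true) in enumerate(zip(vectors, labels)):
        dist = np.linalg.norm(vectors - v, axis=1)   # Euclidean distances
        mean_dist = []
        for s in range(n_stimuli):
            members = labels == s
            members[i] = False                       # leave the test train out
            mean_dist.append(dist[members].mean())
        conf[true, int(np.argmin(mean_dist))] += 1
    return conf

def transinformation(conf):
    """Mutual information (bits) between actual and assigned stimulus."""
    p = conf / conf.sum()
    outer = p.sum(axis=1, keepdims=True) @ p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / outer[nz])))

def corrected_information(vectors, labels, n_stimuli, n_shuffles=20, seed=0):
    """Subtract the mean information obtained after shuffling the labels."""
    rng = np.random.default_rng(seed)
    raw = transinformation(confusion_matrix(vectors, labels, n_stimuli))
    shuffled = [
        transinformation(confusion_matrix(vectors, rng.permutation(labels), n_stimuli))
        for _ in range(n_shuffles)
    ]
    return raw - float(np.mean(shuffled))
```

For a pair of stimuli the confusion matrix is 2 × 2, so perfect discrimination yields 1 bit before bias correction.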
As we saw earlier, the responses of a neuron to various compound gratings often had similar magnitude and even similar waveforms, but different phases. Thus, much of the information present in these neuronal messages about the gratings is carried by the absolute response phase. However, such phase information cannot be used by an observer to discriminate stimulus waveforms, because the absolute phase is confounded with the absolute starting time of the stimulus cycle of the drifting waveforms. This might suggest an alternative experimental design, more along the lines of psychophysical studies, in which the compound gratings are presented as stationary targets. The temporal confound would be eliminated but in its place would be a spatial confound: that of spatial features and absolute positional information. Although this confound could be eliminated by a dense sampling of stimulus position, doing so would lengthen the experiment by a substantial factor.
The alternative is to recognize that in the absence of knowledge of absolute starting time, discrimination of stimulus waveforms can only be based on intrinsic features of the response. This provides a solution to the temporal confound, as follows. We assume that the responses to each cycle of the stimulus are independent and that spikes within each part of the cycle are identically distributed across trials. That is, the absence of knowledge of absolute starting time can be modeled by allowing spike trains recorded during a single stimulus cycle (Fig. 13 a) to be wrapped around the circular stimulus cycle (Fig. 13 b,c). Only features of spike trains that can be distinguished even after an arbitrary wrapping are available to discriminate the spatial profiles. That is, in the absence of knowledge of absolute starting time, the intrinsic difference between two responses is the minimum distance that can be achieved after any shift of one spike train relative to the other around the stimulus cycle. We call this minimum distance a shiftreduced Fourier metric (Fig. 13 d).
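As an illustration of the shiftreduced Fourier metric, the following sketch computes the minimum Euclidean distance between the Fourier-component vectors of two spike trains over circular shifts of one of them. Several choices here are assumptions made for illustration and are not taken from the analysis itself: spike times are expressed as fractions of the stimulus cycle, the DC component is the raw spike count, and the minimum is found by a simple grid search over `n_shifts` candidate shifts. The function names are hypothetical.

```python
import cmath
import math

def fourier_components(spike_phases, n_harmonics):
    """DC plus harmonics 1..n of one response cycle.

    spike_phases: spike times as fractions of the stimulus cycle, in [0, 1).
    """
    comps = [complex(len(spike_phases), 0.0)]  # DC = spike count (assumed scaling)
    for k in range(1, n_harmonics + 1):
        comps.append(sum(cmath.exp(-2j * math.pi * k * p) for p in spike_phases))
    return comps

def shift_reduced_distance(a, b, n_shifts=720):
    """Minimum Euclidean distance between a and b over circular shifts of b.

    A shift by a fraction s of the cycle multiplies harmonic k by
    exp(-2*pi*i*k*s) and leaves the DC term unchanged.
    """
    best = float("inf")
    for step in range(n_shifts):
        s = step / n_shifts
        d2 = 0.0
        for k, (ak, bk) in enumerate(zip(a, b)):
            d2 += abs(ak - bk * cmath.exp(-2j * math.pi * k * s)) ** 2
        best = min(best, d2)
    return math.sqrt(best)

train = [0.10, 0.15, 0.40]
shifted = [p + 0.25 for p in train]   # same train, wrapped a quarter cycle later
other = [0.10, 0.30, 0.40]            # different intrinsic spike pattern

a = fourier_components(train, 8)
print(shift_reduced_distance(a, fourier_components(shifted, 8)))  # essentially zero
print(shift_reduced_distance(a, fourier_components(other, 8)))    # clearly nonzero
```

A spike train and its wrapped copy are at essentially zero distance under this metric, whereas trains with different intrinsic temporal structure remain separated.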
Shiftreduced Fourier metrics can be constructed from any set of Fourier components. Enlarging the subset of Fourier components approximates a waveform increasingly well, but variations in the response carry information about the stimulus only up to a certain temporal precision (Mechler et al., 1998b). Thus, it is necessary to survey how information depends on the set of Fourier components on which the metrics are based. We therefore examined truncated Fourier series including all components up to a variable highest harmonic n. Figure 14 a shows the results of this survey for one of the more sensitive simple cells (also shown before in Fig. 5 c). The line versus edge discrimination is optimized (i.e., information is maximized) when Fourier components up to F_{8} are included in the analysis. Including finer temporal details of the response, especially components above F_{32}, worsens discrimination, indicating that these harmonics are primarily noise.
Similar observations held for the discrimination of all other stimulus pairs (Fig. 14 b). Most stimulus pairs, especially nearby elements of the feature space, evoked very similar numbers of spikes. Thus, the DC alone allowed very poor or no discrimination of compound gratings (an ordinate of close to 0 bits at the 0 point on the abscissa). The same held for a shiftreduced Fourier metric that included only the first harmonic in addition to the DC (abscissa = 1). This is because the metric aligns the phase of the first harmonic, leaving only the DC response and the first harmonic amplitude to discriminate among the stimuli. Because these measures often varied little across stimuli, discrimination on their basis was necessarily poor. Discrimination increased substantially when multiple Fourier components were included in a shiftreduced metric, because it was typically not possible to identify a single phase shift that simultaneously brought multiple harmonics into alignment. Thus, once multiple harmonics were included in the shiftreduced metric, relative temporal phase could play a role in discrimination of stimulus pairs. For most pairs of compound gratings, discrimination information increased with the inclusion of response components up to F_{8}. The information curves typically cut off above F_{16}, indicating that higher components carried no independent stimulusrelated message and/or were too variable to be useful. Similar observations were made for most neurons (with a few exceptions, where information peaked at a lower harmonic, but the sharp decline in discrimination was above F_{8}). Therefore, we used the shiftreduced Fourier metric based on the DC and the first eight harmonic components of the response for subsequent information calculations.
The fact that the responses of most neurons carried temporally encoded information about the spatial waveform of these stimuli primarily in their first eight Fourier components reflects, of course, the temporal frequency content of the stimuli. For simple cells, this might be considered a trivial observation, but for complex cells it is not, because their responses do not merely mimic the temporal modulation of the stimulus, even for standard sinusoidal gratings. Moreover, the nontrivial nature of this observation is demonstrated by changing the drift speed of the compound gratings while keeping the spatial frequencies constant. A fourfold increase in the drift speed (12°/sec instead of 3°/sec) changes the temporal fundamental of the stimuli fourfold (from 1 to 4 Hz). Correspondingly, the time represented by a constant phase shift by any particular harmonic is decreased fourfold. Nevertheless, for most neurons, the peak of the information curve obtained with the shiftreduced Fourier metrics remained at the eighth response harmonic of the stimulus fundamental. This means that the precision of the encoded temporal detail increased fourfold “in tune” with the fourfold increase in the temporal frequency content of the stimulus. For only 10 data sets (20% of high speed data), information curves peaked for Fourier metrics that used truncated series up to and including the second or fourth rather than the eighth harmonic of the fundamental. The maximum precision of temporal coding of spatial detail in these cells (defined by the half period of the fourth harmonic) is ∼30 msec, and in those neurons with an information peak that remained at the eighth harmonic truncation, the precision limit would be 15 msec or better. These numbers correspond well to earlier reports of temporal coding of temporal phase (Geisler et al., 1991) and spatial phase (Victor and Purpura, 1998).
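The precision limits quoted above follow from taking the half period of the highest informative harmonic as the precision limit. A small sketch of that arithmetic for the 4 Hz temporal fundamental of the high-speed condition (the function name is hypothetical):

```python
def half_period_ms(fundamental_hz, harmonic):
    """Half period (msec) of a given harmonic of the temporal fundamental."""
    return 1000.0 / (2.0 * fundamental_hz * harmonic)

# High drift speed: temporal fundamental of 4 Hz.
print(half_period_ms(4.0, 4))   # fourth harmonic (16 Hz): 31.25 msec
print(half_period_ms(4.0, 8))   # eighth harmonic (32 Hz): 15.625 msec
```

These values correspond to the ∼30 and 15 msec limits given in the text.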
Neuronal discrimination thresholds for congruence phase
One way to summarize the information measurements obtained with the shiftreduced Fourier metric shown in Figure 14 b is to construct threshold functions by analogy to psychophysical threshold functions (Fig. 14 c). To measure neuronal thresholds, neurometric functions using spike counts were used by Movshon and colleagues (Newsome et al., 1989); a similar approach but different response measure was used for measuring neuronal temporal phase discrimination for auditory and visual stimuli by Geisler et al. (1991). To establish the difference of congruence phases at threshold in discriminating a line from other compound gratings, we plotted information (in bits) for each pairwise comparison of the linelike waveform with the other stimulus waveforms. As a threshold criterion, we chose 0.32 bits, corresponding to Burr's 82% correct performance criterion (Burr, 1980) in a twoalternative forcedchoice (2AFC) task with the two choices a priori equally probable. Thus, the neural threshold for congruence phase discrimination was defined by the abscissa of the intersection of the performance curve (interpolated between the eight measured values) with the criterion line. This procedure provided two onesided thresholds: one on each side of the linelike waveform on the phase circle (Fig. 14 c, inner pair of vertical arrows). These two onesided thresholds were then averaged to obtain the threshold phase for the line discrimination (≈ 0.1π for the cell in Fig. 14 c). A representative human psychophysical threshold (Burr, 1980) is also shown in Figure 14 c (outer pair of vertical arrows). By this analysis, this macaque V1 simple cell appeared to be more sensitive than the human observer, at least for the condition where the linelike waveform was the test stimulus to be discriminated. However, this high sensitivity was not typical of V1 neurons (see below).
In a similar manner, discrimination thresholds can be calculated with each of the other waveforms serving as the reference. Figure 14 d shows (open symbols) discrimination threshold (difference of congruence phases) as a function of the congruence phase of the reference waveform. This threshold function summarizes waveform or 1D feature discrimination for this simple cell.
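The correspondence between information and fraction correct can be made explicit. The sketch below assumes a 2AFC task with equally probable choices and, as an additional simplifying assumption, symmetric errors, so that the information is one bit minus the binary entropy of the fraction correct. The function names are hypothetical.

```python
import math

def bits_from_fraction_correct(p):
    """Information (bits) for fraction correct p, equal priors, symmetric errors."""
    if p <= 0.0 or p >= 1.0:
        return 1.0
    entropy = -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
    return 1.0 - entropy

def fraction_correct_from_bits(bits, tol=1e-12):
    """Invert the relation on [0.5, 1] by bisection."""
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bits_from_fraction_correct(mid) < bits:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(bits_from_fraction_correct(0.82), 2))   # 0.32
print(round(fraction_correct_from_bits(0.32), 2))   # 0.82
```

Under these assumptions, 82% correct corresponds to approximately 0.32 bits, the criterion used here.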
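The threshold construction can be sketched as follows. For one reference waveform and one direction around the phase circle, the threshold is the phase difference at which the information curve first reaches the criterion. The sketch assumes linear interpolation between measured points (the interpolation scheme is an assumption of this example) and zero information at zero phase difference. Phase differences are in units of π, the function names are hypothetical, and the information values are made up for illustration.

```python
def one_sided_threshold(phase_diffs, info_bits, criterion=0.32):
    """Phase difference at which information first reaches the criterion.

    phase_diffs: increasing phase differences from the reference (units of pi).
    info_bits:   information (bits) for each comparison with the reference.
    Returns None if the criterion is never reached (nonmeasurable threshold).
    """
    prev_x, prev_y = 0.0, 0.0  # self-comparison carries no information
    for x, y in zip(phase_diffs, info_bits):
        if y >= criterion:
            # linear interpolation between the bracketing points
            return prev_x + (criterion - prev_y) * (x - prev_x) / (y - prev_y)
        prev_x, prev_y = x, y
    return None

def two_sided_threshold(left, right):
    """Average of the two onesided thresholds; None if either is missing."""
    if left is None or right is None:
        return None
    return 0.5 * (left + right)

# Illustrative (made-up) information values at steps of pi/8 from the reference.
steps = [0.125, 0.25, 0.375, 0.5]
print(one_sided_threshold(steps, [0.2, 0.6, 0.9, 1.0]))
```

Repeating this with each of the eight waveforms as the reference yields a threshold function of the kind summarized in Figure 14 d.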
The cell shown in Figure 14 was the most sensitive to congruence phase in our sample. The information content of pairwise discrimination based on single responses was often close to the ideal maximum of 1 bit (perfect discrimination), especially for pairs of compound gratings with congruence phase that differed by π/4 or more. The threshold curve, however, showed a systematic dependence on congruence phase: discrimination of compound gratings from the linelike waveform (difference of congruence phases at threshold ≈ 0.1π) was superior to discrimination of compound gratings from the edgelike waveform (difference of congruence phases at threshold ≈ 0.18π).
Although this cell, one of the most sensitive to differences in spatial waveform in our V1 sample, can perform phase discrimination for linelike stimuli at least as well as the normal human observer (Fig. 14 d, filled symbols; converted to our units from the Burr study), it is well outperformed by the human observer for discrimination of edgelike stimuli. The pattern of dependence of threshold on congruence phase in this cell is exactly the opposite of that in the human observer. Thus, we ask whether this discrepancy holds in V1 neurons in general. We also determine how spatial feature discrimination in V1 neurons compares with that in human observers for the typical neuron and not just for the most sensitive V1 neurons.
To answer these questions, we analyzed all data that passed our initial criterion for analysis (as stated in the first section of Results: 121 data sets from 31 simple cells and 46 complex cells). Within this group, we computed the phasediscrimination threshold functions for all neurons in the subset that met the 82%correct threshold criterion. This turned out to be a stringent criterion: only ∼10% of this group (20% of the simple cells and 5% of the complex cells) met this criterion for at least one reference phase (Fig. 15 a). Of these most sensitive V1 neurons, eight data sets from five simple cells, including the example in Figure 14, and one complex cell, met the 82%correct criterion for all eight congruence phases used as reference. Connecting lines of different types indicate five of these threshold curves. Only isolated data points are plotted for the three remaining neurons, at which measurable thresholds (at the 82% criterion) were identified for a subset of the eight reference congruence phases. The human threshold curve measured by Burr (1980) is plotted again for comparison (Fig. 15 a, filled symbols connected with dotted line). Neuronal thresholds were typically much larger than the psychophysical human thresholds, even in the most sensitive simple cells. The lower boundary of the macaque data traces the curve of the human thresholds in most conditions, but this may be a mere coincidence. Individually, the phasediscrimination thresholds in these most sensitive neurons display quite a varied dependence on the congruence phase of the test stimulus. Across this admittedly small sample, the shape of this dependence, as summarized by their median (Fig. 15 b, connected stars), does not match what was observed in the human observer.
The close agreement between the human thresholds and the lower boundary of the data in Figure 15 a is consistent with a winnertakeall mechanism in V1 that could account for the psychophysical thresholds (Parker and Newsome, 1998). However, our analysis focused on the most sensitive individual neurons (as defined by a somewhat arbitrary threshold criterion) and ignored the bulk of the population. The stringent threshold criterion excludes most neurons in V1 that have phasesensitive responses, simply because they would not suffice to signal congruence phase in isolation. To consider the contributions of the less sensitive neurons, we relaxed the threshold criterion from 82 to 68% correct (equivalently, from 0.32 to 0.1 bits of information). This 70% drop in the equivalent information at criterion resulted in a fourfold increase in the size of the most sensitive subset of V1 neurons. Approximately 40% of the analyzed data (for both simple and complex cells) met this relaxed threshold criterion at least at one test phase condition (Fig. 15 b). The median threshold across neurons (connected stars) reaches the maximum possible relative phase near the edgephase, because even with this lower criterion, half of the neurons had nonmeasurable thresholds. Again, the median dependence on congruence phase is different from what is seen for the human observer. Thus, to account for the pattern of the human thresholds, V1 activity must be “read out” in a manner that does not just sum up the activity of individual neurons and most likely involves phasespecific processing.
Varieties of feature discrimination in V1 cells
Figure 15 summarizes, for single neurons and for an arbitrary threshold level, the dependence of phase discrimination on the reference congruence phase. However, there is no guarantee that phase discrimination is a uniform and monotonic function of phase difference. Rather, the phase discrimination capabilities of a single neuron may be a more general function of the congruence phases of the two stimuli to be discriminated. Indeed this is the case, as the discrimination functions for six cells, representative of our V1 sample, show (Fig. 16). The x and y axes of these plots represent the congruence phase (i.e., position in feature space) of the two compound gratings in a discrimination pair. The height of the surface is phase discrimination, measured as information obtained with the shiftreduced Fourier metric. Perfect discrimination of two stimuli corresponds to 1 bit, the maximum possible information in paired comparison. The trough touching zero bits on the diagonal represents selfcomparison (i.e., the absence of discrimination of two identical stimuli). Because the feature space has period π, these plots also have period π in both x and y. The plots are necessarily symmetric across the main diagonal, because our measure of discrimination is independent of which grating is considered the reference.
In these plots, featureselective discrimination corresponds to elevated surfaces (peaks or ridges) in the narrow vicinity of the congruence phase of the selected waveform. Discrimination that is a monotonic function of phase difference would result in a surface that increases monotonically in height with increasing distance from the diagonal (and its periodic repeats at intervals of π); this behavior is seen in Fig. 16 a–d but not e or f. The ideal energy operator would respond equally vigorously to each of the waveforms and thus would be associated with a flat surface at zero, because its response provides no discrimination.
Figure 16 a shows a sensitive simple cell and Figure 16 b shows a sensitive complex cell. The information surface for both cells has a prominent offdiagonal ridge (near maximum offset in feature space, i.e., the locus x − y = π/2). The ridge is of approximately uniform height, and along the axis perpendicular to this ridge the information surface is roughly symmetric. This signifies equal discrimination of all pairs of stimuli that are equally offset in feature space and maximal discrimination of pairs that are maximally offset in feature space (separated by π/2 congruence component phase). For such neurons, only the relative positions in feature space of the compared stimuli matter. Because these cells discriminate equally different pairs of waveforms equally well, they are tuned only for differences in waveform. Therefore, we could describe them as featurenonselective opponent. Within our sample, these cells are the least noisy and the most sensitive (in terms of bits of peak information) units that show this kind of behavior.
The simple cell in the middle left (Fig. 16 c; this is the paradigmatic simple cell of Fig. 25) has a different behavior. Rather than diagonal ridges, the information surface has ridges parallel to the coordinate axes, with peaks on the lines φ_{1} = 2π/8, φ_{2} = 2π/8 and falling to near zero at intermediate positions. This behavior is most evident from examination of the contour lines (replotted separately in inset). This pattern in the information surface approximates an “XOR”type phase discriminator; namely, there is substantial discrimination between a pair of features when one feature is near φ = 2π/8 and the other is not. Other cells in our sample displayed XOR pattern tuned to various congruence phases.
The last three examples (Fig. 16 d–f) all have one or more prominent discrete peaks (plus their periodic replicates as required by the symmetry and periodicity of the plot) rather than ridges. The complex cell in Figure 16 d has a single peak at (φ_{1} = 0π/8, φ_{2} = 4π/8), exactly on the maximally offset offdiagonal. On the basis of the responses of this neuron, edgelike and linelike compound gratings are discriminable, but all other pairs of compound gratings would be confused. This behavior could thus be called featureselective opponent. Interestingly, this neuron was recorded simultaneously with the complex cell of Figure 5 e that responded only to the full edge but not to the fourcomponent compound gratings.
The last two cells in Figure 16 show another narrowly tuned behavior, which could be called featureselective nonopponent. Remarkably, they are tuned to discriminating pairs of waveforms that are very similar to each other (i.e., occupy nearby positions in feature space and form neardiagonal positions in these plots), but they do not discriminate stimuli that are in opponent positions in our feature space. The simple cell (Fig. 16 e), with its single discrimination peak at (φ_{1} = 0π/8, φ_{2} = 1π/8), resolves the differences between two similar linelike waveforms but confuses all other pairs. The complex cell (Fig. 16 f) performs a similarly narrow discrimination on not one but two pairs of waveforms, with one peak at (φ_{1} = 3π/8, φ_{2} = 4π/8), a pair of edgelike waveforms, and another at (φ_{1} = 5π/8, φ_{2} = 6π/8), a pair of waveforms intermediate between edge and line. The relatively low sensitivity (as measured by the small information values near peak) of this cell is typical of the V1 population as a whole.
Every cell in our V1 sample had, in pure form or in some combination, features displayed by one of these three examples. Most neurons had a ridge (indicating some form of nonselective behavior). In most cases, the ridge was near the maximum offset in feature space (the offdiagonal locus x − y = π/2), i.e., featurenonselective opponent behavior, whereas a few displayed XOR like rectangular crossed ridges. The information surface in approximately half the neurons had a peak, either in isolation (the narrowly tuned featureselective opponent or featureselective nonopponent behavior) or superimposed on a ridge. The peak location along the ridge (i.e., the position in feature space that maximizes discrimination) varied across neurons.
The average discrimination, taken as the information surface averaged across neurons, displays a mixture of these features. Figure 17 a shows the average waveform discrimination in V1 simple cells, calculated across 43 data sets. The average information is dominated by the featurenonselective opponent ridge (at maximum offset in feature space). Superimposed on this ridge near (φ_{1} = 0π/8, φ_{2} = 4π/8) is a slight elevation, indicating a mild population bias for the selective line versus edgetype opponency. At this peak, discrimination in the average simple cell reaches 0.125 bits, or 70% correct. (Here, information is converted to the equivalent fraction correct in a 2AFC paradigm, as defined at Fig. 14.) The complex cell average (N = 78) is shown in Figure 17 b. This exhibits very similar features to the simple cell average (same major ridge, same location for the superimposed elevation) but has approximately half the simple cell sensitivity: 0.056 bits (64% correct) at peak discrimination. The overall V1 average (Fig. 17 c) also has the same features, with levels of discrimination intermediate between the average simple and complex cells (∼0.08 bit, or 66.6% correct at peak). That is, the typical V1 cell, but not the most sensitive ones, when considered in isolation, does not reach 70% correct and cannot reliably distinguish any of these compound gratings.
Unimpressive as these information values are in the average V1 neuron, the levels of selectivity and the variety of patterns exhibited by the V1 neurons most tuned to phase discrimination—and these include complex cells, not just simple cells—are all the more impressive. The observed selectivity patterns, such as the selective and nonselective opponency, XOR, and nonopponent selectivity, can be considered as the elementary operators of a “feature algebra.” That is, combining these behaviors via addition and multiplicative interactions could give rise to genuine narrowly tuned detectors and discriminators of arbitrary 1D spatial features, which would be found most likely at an extrastriate stage.
Finally, we address the relationship between the optimal congruence phase defined as the tuning peak (predicted by the fitted functions such as in Fig. 6 or the ellipses in Fig. 9) and the congruence phase of the features corresponding to the discrimination peak (identified by the maximum information on the information surfaces such as those in Fig. 16). Within the domain of orientation, the relationship between tuning and discrimination is straightforward: it is determined by the slope of the tuning function. For the problem at hand, however, the relationship is more complicated. For example, consider the simple cell in Figures 6 and 9 again. The optimal congruence phase was φ_{opt} ≈ 0.65π, with the precise value depending on which scalar measure or Fourier harmonic is used. The congruence phase of the maximally discriminable feature was φ_{pk} ≈ 0.25π (Fig. 16 c). Note that this phase corresponds to the minimum response size, not the maximum absolute value of the derivative of the tuning curves in Figure 6 or the maximum rate of movement of the response along the ellipse loci in Figure 9. There are many other examples of such discrepancies in our sample. The existence of multiple response measures per se is not the basis for this discrepancy, because the response measures often have similar tunings. Rather, to provide for feature discrimination that is independent of absolute spatial position, interrelationships between these response measures (e.g., the relative phase of Fourier components) must be used, and the reliability and sensitivity of such interrelationships need not be tightly linked to the individual tuning curves. This line of reasoning clearly suffices only to provide an intuitive basis for understanding why such discrepancies exist.
Spatial phase discrimination: comparison of simple and complex cells
Finally, we analyze the relationship between sensitivity to spatial features and the traditional simple/complex classification of V1 neurons. The traditional view, based largely on conventional tests with bars and simple grating stimuli, holds that the quasilinear nature of simple cells allows them to convey precise positional and spatial phase information, whereas the nonlinear spatial integration that distinguishes complex cells markedly reduces the positional and phase information that they can transmit (Movshon et al., 1978b). An alternative view is that the simple and complex cells form a functional continuum, rather than a dichotomy. Within the context of the latter view, the classaveraged positional (spatial phase) sensitivities are considered to be well segregated for the two traditionally defined cell classes, although the distributions might have a considerable overlap.
We examined whether these notions extended to discrimination of spatial features, which requires both spatial phase sensitivity (considered to be characteristic of simple cells) and spatial nonlinear interactions (considered to be characteristic of complex cells). The conventionally used quantitative classifier of V1 neurons is the modulation ratio (Skottun et al., 1991), which is calculated as the ratio of the response amplitude at the fundamental frequency of an optimal grating over the mean spike rate. We examined the correlation of the modulation ratio with two measures of neuronal spatial feature sensitivity in our V1 sample. The first measure that we used was the peak of the information surface of a cell's discrimination of pairs of compound gratings, such as shown in Figure 16. This is a measure of the peak discrimination sensitivity of the cell for spatial congruence phase. We included in this analysis 159 data sets that consisted of the 121 data sets used in all preceding analyses presented in Results (all that passed the original d′ criterion for analysis), plus some of the data sets that contained well isolated but poor responses (randomly selected subset of those that did not pass the d′ criterion). The rationale for including these additional data sets was to analyze a more realistic sample of V1 neurons, rather than a subset biased toward neurons of high feature sensitivity. As a second measure, we used the lowest phase discrimination threshold. This was calculated as the minimum, across all test congruence phases, of the difference of congruence phases of the spatial waveforms that the cell could discriminate from the test waveform at the 68%correct threshold criterion (i.e., the minimum within each cell of the thresholds shown in Fig. 15 b). Only about onethird (N = 52) of the data sets used for the first measure qualified for this analysis; for the remainder, no pair of congruence phases could be discriminated at this criterion level.
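For concreteness, the modulation ratio can be computed from a response histogram as sketched below. This is a minimal illustration, assuming a firing-rate histogram sampled at a fixed bin width over an integer number of stimulus cycles; the function name and the example responses are hypothetical.

```python
import cmath
import math

def modulation_ratio(rates, dt, f1_hz):
    """Amplitude at the fundamental frequency divided by the mean rate.

    rates: firing rate per bin (spikes/sec), spanning whole stimulus cycles.
    dt:    bin width (sec).
    f1_hz: temporal fundamental frequency of the drifting grating (Hz).
    """
    n = len(rates)
    mean_rate = sum(rates) / n
    # Amplitude of the Fourier component at f1 (factor 2 gives peak amplitude).
    f1 = 2.0 * abs(sum(r * cmath.exp(-2j * math.pi * f1_hz * i * dt)
                       for i, r in enumerate(rates))) / n
    return f1 / mean_rate

n_bins, dt, f1_hz = 1000, 0.001, 1.0
# Half-wave rectified sinusoid: strongly modulated response.
rectified = [max(0.0, math.sin(2 * math.pi * i / n_bins)) for i in range(n_bins)]
# Weak modulation riding on a high mean rate.
elevated = [10.0 + 2.0 * math.sin(2 * math.pi * i / n_bins) for i in range(n_bins)]

print(round(modulation_ratio(rectified, dt, f1_hz), 3))  # close to pi/2
print(round(modulation_ratio(elevated, dt, f1_hz), 3))   # 0.2
```

The two examples fall on opposite sides of the conventional class boundary at a modulation ratio of 1.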
Figure 18 shows the relationship of the modulation ratio to the two measures: peak sensitivity (Fig. 18 a) and lowest threshold (Fig. 18 b). The top panels are scatter plots of either measure against the modulation ratio, considered as an index of cell classification. A positive correlation was expected if simple cells possessed significantly greater feature discrimination sensitivity than complex cells (Fig. 18 a). Conversely, a negative correlation was expected if simple cells possessed significantly lower feature discrimination thresholds than complex cells (Fig. 18 b). In fact, neither scatter plot shows any significant dependence of these measures on the index of cell class (r < 0.1). Additionally, the data show no evidence for a dichotomy in feature discrimination along the class index, either at the class boundary (a modulation ratio of 1) or anywhere else. Most V1 neurons, whether simple or complex, exhibit weak discrimination of congruence phase. The most sensitive neurons (top half of the scatter plot in a, bottom quarter in b) form a small but nondistinct minority within the continuum of the overall V1 population. Although the simple cells (left half of each scatter plot) seem slightly overrepresented among the most sensitive neurons, the most sensitive neurons also include complex cells (right half of each scatter plot).
The middle panels in Figure 18 show the marginal distribution of each measure of feature sensitivity. For either one, the Kolmogorov–Smirnov test did not reject the null hypothesis that the simple and complex samples came from the same distribution. However, the similarity of the distributions in simple and complex cells comes with an apparent relatively higher abundance of simple than complex cells among the most sensitive neurons. Is this significant? For the analysis in Figure 18 a, we selected 159 data sets of a total of 226 recorded (59 of 74 simple, and 100 of 152 complex). Of these, only 21 data sets (10 simple and 11 complex data sets) exhibited peak sensitivity of 0.2 bits or higher (an arbitrary threshold for high feature sensitivity, indicated by the arrow in the middle panel of Fig. 18 a). This translates to a higher proportional representation of simple than complex cells, 14% (17%) of all (analyzed) simple data sets versus 7% (11%) of all (analyzed) complex cell data sets, but not to a statistically significant degree (p > 0.1 by 2 × 2 χ^{2}). The higher proportional representation of simple cells evaporates for less stringent criteria. We mentioned in connection with Figure 15 that the proportion of simple and complex cells that passed the 68%correct threshold criterion, a wider subset of neurons, was practically identical (∼20% of all data sets and 36% of data sets analyzed, within each cell class). Thus, depending on the criteria, only ∼10–20% of V1 neurons exhibit notable feature sensitivity. Complex cells make up a smaller fraction than simple cells of this subset, but nevertheless are well represented even in the subset of most sensitive cells.
The distribution of the modulation ratio in both samples (Fig. 18, bottom panels) exhibits the well known bimodality with the usual dip near 1. Thus our sample, as far as conventional classification is concerned, was typical of the results of others (Skottun et al., 1991). The ratio of the simple/complex population size was, depending on the subset analyzed, ∼1:2–2:3, also in the range usually observed in the macaque V1 overall (Skottun et al., 1991).
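The 2 × 2 χ^{2} comparison can be reproduced from the counts given above. The sketch below uses the Pearson statistic without continuity correction, which is an assumption of this example (the text does not specify whether a correction was applied). With one degree of freedom, the p value follows from the complementary error function. The function name is hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) with p computed for one degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Analyzed data sets: 10 of 59 simple and 11 of 100 complex reached 0.2 bits.
print(chi2_2x2(10, 59 - 10, 11, 100 - 11))
# All recorded data sets: 10 of 74 simple and 11 of 152 complex.
print(chi2_2x2(10, 74 - 10, 11, 152 - 11))
```

In both tabulations the difference in proportions is not significant at the 0.1 level, in agreement with the statement above.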
Summary
We used compound gratings consisting of a fundamental and its first three odd harmonics, the congruence phase of which was varied systematically. This stimulus set was well suited for dissociating linear and nonlinear mechanisms (only odd harmonics were present), and also for dissociating phasespecific nonlinearities from those sensitive only to stimulus power (all waveforms were of equal energy). Moreover, the use of compound gratings containing several harmonics, and the choices of relative phases, facilitated an interpretation in terms of extraction of simple features (i.e., the phase coherences typical of lines and edges). With these stimuli, we obtained a number of novel findings:
(1) Phasespecific nonlinear interactions, up to fifth and higher orders, are present with similar strength and harmonic composition in both simple and complex cell responses. In simple cells, odd harmonic responses were of slightly higher gain than even harmonics, but they were equally selective for spatial feature. In complex cells, even harmonics were of much higher gain, but the odd harmonics were more selective.
(2) The preferred congruence phase in phasetuned V1 neurons is relatively poorly predicted by the spike count, but equally well predicted by other scalar (e.g., energy) and vector (significant Fourier harmonics) response measures.
(3) Cells were encountered that are tuned not only to edges and lines but also to intermediate stimuli with mixed symmetry. Although a few cells that approximated a phaseinsensitive energy operator were also encountered, even most complex cells were tuned to congruence phase.
(4) Although the distribution of the preferred congruence phase in V1 was broad, with all congruence phases represented, the population as a whole, regardless of cell class, displayed a slight bias toward lines and possibly edges. This may represent a genuine relative abundance of even symmetry preference among V1 neurons.
(5) Feature preference and selectivity also varied within a local cluster of V1 neurons.
(6) Information analysis using a method that was insensitive to absolute phase or position (the shift reduced Fourier metric) revealed that V1 neurons encoded relative phase in the entire frequency band present in the stimuli (as limited by the passband of the linear filter of the receptive field).
(7) The envelope of feature discrimination thresholds, estimated from information analysis, in the most sensitive V1 neurons (5% of complex cells and 20% of simple cells) matched the human psychophysical thresholds, but the dependence of feature discrimination threshold on congruence phase showed many patterns in single V1 neurons, and most differed from the pattern in human observers.
(8) The responses of most cells were rather noisy and displayed low sensitivity to relative phase. A minority of V1 neurons that included both simple and complex cells were highly sensitive and selective. The peak discrimination sensitivity of the average V1 neuron does not reach the level of 70% correct. Simple cells on average were twice as sensitive at peak as complex cells. However, the distribution of peak feature sensitivity and the lowest threshold of feature discrimination were both indistinguishable between simple and complex cells and indicated a continuum in V1.
(9) The existence, among sensitive neurons, of nonlinearities tuned to feature pairs or feature differences, suggests that a subpopulation of V1 neurons does more than simply pass on the relevant information necessary for spatial feature detection to a downstream stage.
DISCUSSION
Visual perception relies on the successful identification and localization of features such as lines and edges that define the shape and boundaries of objects. These features arise from spatial phase congruence, the local agreement of spatial phase of multiple harmonic components of an image. However, despite their importance, the neural computations underlying detection and discrimination tasks based on relative spatial phase are unclear. Because feature extraction requires phaseselective nonlinear filtering, the earliest stage at which this operation is likely to occur along the visual pathway is in the primary visual cortex. Our experiments were designed to test whether V1 neurons can account for the specificity of feature extraction for lines and edges by studying their sensitivity to relative phase.
Physiology: comparison with other studies
Only a few studies have examined the sensitivity of single neurons in the primary visual cortex to relative spatial phase or phase coherence. In anesthetized cats, De Valois and Tootell (1983) recorded the responses of striate cortical neurons to rigidly drifting compound gratings composed of pairs of spatial sinusoids, one of which was always near the optimal spatial frequency. These authors noted large suppressive or facilitative interactions in most simple cells and in one-third of complex cells. Their analysis, however, was restricted to the Fourier amplitude at the fundamental frequency for simple cells and to the mean firing rate for complex cells, and thus ignored response phase and waveform. Our results indicate that limiting the analysis to those response components could miss most of the phase-specific frequency interactions. Also in striate cortex of the anesthetized cat, Levitt and colleagues (1990) recorded the responses to f + 2f-type spatial compound gratings that were presented both drifting and in counterphase modulation. Although they emphasized that the temporal response properties of neurons could confound their spatial selectivity (Dean and Tolhurst, 1986), they did not attempt to analyze the receptive field nonlinearities. Any attempt to do so would have been difficult because responses to f + 2f stimuli would confound stimulus components with low-order nonlinear interaction frequencies. Pollen and colleagues (1988), in monkey V1, did attempt to analyze the nonlinearities contributing to feature selectivity and used drifting f + 3f compound gratings, which are more appropriate for this purpose. They concluded that responses of simple cells were fully explained by filtering by their linear receptive field followed by a nonlinearity (threshold). To account for the responses of complex cells they advanced an energy-operator model that sums simple cell responses in appropriate (quadrature) phase combination.
As pointed out above, an energy model such as this cannot account for our findings in complex cells. We are currently working on more elaborate models that could account for the major findings presented here.
Spatial feature discrimination in simple and complex cells
We characterized discrimination of elementary features (on the basis of relative spatial phase) according to several measures and found little if any correlation between these measures and the simple/complex distinction. This result is unexpected from the traditional view of the phase sensitivity of simple and complex cells. High versus low spatial phase sensitivity in simple versus complex cells is widely held as one of the clearest and most quantitative indicators of a functional dichotomy between these neuronal classes (Movshon et al., 1978a,b; De Valois and De Valois, 1980; Hamilton et al., 1989; Skottun et al., 1991); however, the usual class distinctions based on phase sensitivity are derived from responses to simple gratings or bars. Here, we obtained a very different view about simple and complex cells by using more complex stimuli. Spatial phase sensitivity alone does not suffice to provide for spatial feature discrimination; joint processing of spatial phase across multiple spatial frequencies combined by a spatial nonlinearity is required as well. Spatial phase sensitivity is generally considered to be more prominent in simple cells, whereas spatial nonlinearities are generally considered to be more prominent in complex cells. Viewed in this manner, it makes sense that spatial feature discrimination does not fall squarely into the province of one cell type or the other. Indeed, the need for both aspects of receptive field structure for this very crucial operation of early vision provides a rationale for an organization of cortex not into two distinct classes but into a continuum (Dean and Tolhurst, 1983; Chance et al., 1999; Mechler and Ringach, 2002).
Correspondence with psychophysics of feature detection and discrimination
The psychophysics of absolute and relative spatial phase discrimination has been amply studied [for a review of much of the early work see De Valois and De Valois (1980)]. Relative phase discrimination threshold was studied, among others, by Burr (1980). A static f + 3f compound grating was presented to the observer with variable relative phase. At moderate (∼10%) contrasts, phase discrimination thresholds were ∼30°. This was relatively constant across almost the full range of test congruence phases, except that phase discrimination thresholds were markedly elevated for linelike waveforms, and somewhat elevated for edgelike waveforms. This last point was given a tentative explanation based on the steepest slope principle in conjunction with the widely held assumption [supported by several psychophysical studies reviewed by Burr (1980)] that phase-selective mechanisms of the visual system are predominantly tuned to edgelike and linelike waveforms.
Our results on neuronal phase discrimination thresholds obtained at the 82%-correct equivalent criterion are consistent with the notion that a winner-take-all mechanism among the V1 neurons most sensitive to relative phase can account for Burr's psychophysical thresholds, but this explanation carries several caveats: (1) we used five times higher contrast than was used for the psychophysics; (2) we optimized fundamental spatial frequency for the bandwidth of the cell, but psychophysics was done with a fixed 3 c/° fundamental, which was near optimal for the observer but presumably not for all of his V1 neurons; (3) we used four harmonic components (f + 3f + 5f + 7f), making richer and potentially more salient waveforms than the two-component (f + 3f) stimulus that was used in psychophysics. All of these differences act to favor the neuronal thresholds; without them, even the best single neuronal thresholds would be expected to be higher than human psychophysical thresholds.
Computational implications for the functional circuitry in visual cortex
Although the most sensitive simple and complex cells (e.g., those of Fig. 15a) show discrimination approaching psychophysical levels, the correspondence with human thresholds typically did not hold for all test phases for any single neuron. Thus we must consider the possibility that pooling of signals from several single neurons is necessary to account for the psychophysical thresholds. Across V1 neurons, we observed a wide variety of patterns in the dependence of threshold on test phase; most were different from the dependence observed in human thresholds. Furthermore, unlike the situation for orientation and spatial frequency, the observed variation in phase selectivity among nearby neurons is considerable. Thus, although spatial pooling likely plays a considerable role, it must be phase specific, rather than locally indiscriminate.
We encountered V1 neurons with a phase discrimination function that was tuned to feature pairs (e.g., selective feature-opponent) (Fig. 16d–f), selective presence of a feature (XOR) (Fig. 16c), or feature differences (nonselective feature-opponent) (Fig. 16a,b). Their existence raises the possibility that V1 neurons actively participate in feature detection and discrimination and do not merely act as linear filters that pass on the relevant information to downstream detector mechanisms. On the other hand, because (as we have shown) V1 neurons do not account for the pattern of feature sensitivity, later stages must do more than merely improve thresholds.
It is unclear how these precursors of feature discriminators in V1 could be constructed within the known functional circuitry of V1. Creating nonselective feature-opponent cells by pooling signals from many variously tuned selective feature-opponent cells encounters some difficulty, given the observation that both simple and complex cells can act as nonselective feature-opponent discriminators. On the other hand, building nonselective feature-opponent cells from selective ones with the aid of selective inhibition leaves open the question of where the selectivity of inhibition arises. One way to resolve these difficulties might be to invoke feedback from extrastriate visual areas, such as V2, rather than to attempt to build these properties solely from the intrinsic circuitry of V1, but this must be considered speculative at present. These and other questions, such as those related to the cortical processing of 2D shape information, should motivate experiments in extrastriate cortex that could expand on these efforts.
Appendix
A formalism for compound grating stimuli
We consider general compound grating stimuli, and then specialize to the particular drifting compound gratings used in these experiments. This approach adds little notational inconvenience and clarifies what is generic to compound gratings and what is specific to our choice of stimuli.
We assume that the compound gratings are composed of L sinusoidal components, each at an integer multiple of a common fundamental frequency f. The set of integer multiples is denoted {k(1), k(2), …, k(L)}, and the corresponding frequencies are denoted {F_{k(1)}, F_{k(2)}, …, F_{k(L)}}. Thus F_{k(j)} is the frequency of the jth component, which is the k(j)th harmonic of a common fundamental, i.e., F_{k(j)} = k(j)f. We use the vector k⃗ as a compact notation for the set of components {k(1), k(2), …, k(L)} and use similar vector notation for other analogous sets of indices below.
Each single sinusoid of a compound grating can be written as a sum of a complex exponential and its complex conjugate pair. This decomposition allows for convenient tracking of the phase and frequency of nonlinearly interacting components. Correspondingly, a sum of sinusoids W_{k⃗}(t) corresponding to a compound grating of components k⃗ can be written as a sum of exponential terms and their complex conjugate pairs, one for each temporal frequency component:
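The decomposition into exponential terms and their conjugates can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a cosine convention for each component, and the function names, amplitudes, and phase values are arbitrary choices made for illustration.

```python
import cmath
import math

def grating_cos(t, f, harmonics, amps, phases):
    """Compound grating as a plain sum of cosines (assumed convention)."""
    return sum(a * math.cos(2 * math.pi * k * f * t + p)
               for k, a, p in zip(harmonics, amps, phases))

def grating_exp(t, f, harmonics, amps, phases):
    """Same waveform as a sum of exponential terms and their conjugates."""
    total = 0j
    for k, a, p in zip(harmonics, amps, phases):
        term = 0.5 * a * cmath.exp(1j * (2 * math.pi * k * f * t + p))
        # The conjugate term sits at frequency -k*f with phase -p.
        total += term + term.conjugate()
    return total

harmonics = [1, 3, 5, 7]                  # illustrative choice of components
amps = [1.0 / k for k in harmonics]       # illustrative amplitudes
phases = [0.3, -1.1, 2.0, 0.7]            # arbitrary phases, in radians
times = [i / 200.0 for i in range(201)]
max_diff = max(abs(grating_cos(t, 2.0, harmonics, amps, phases)
                   - grating_exp(t, 2.0, harmonics, amps, phases))
               for t in times)
```

In this form each component appears once at a positive frequency and once at the mirrored negative frequency with the opposite phase, which is what makes the bookkeeping of interaction frequencies and phases convenient.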
Frequencies and phases in the linear response
The essence of a linear system is that its response contains components only at frequencies that are present in the stimulus and that the phase of each component of the response has a constant offset with respect to the phase of the corresponding component of the stimulus. Thus, a linear response, R
Nonlinear interaction frequencies in the response
The response R(t) of a nonlinear system to a sinusoidal sum W_{k⃗}(t) can typically be well approximated by an appropriately chosen sum,
Distinct frequency n-tuples in Equation EA3 (i.e., n-tuples of the same order n with indices m⃗ and m⃗′ that differ by more than merely a permutation, and perhaps n-tuples of distinct orders n and n′) may give rise to the same output frequency. However, as we show below, these output components will generally have different dependences on the phases of the input frequencies and thus may be separated. The compact notation of Equation EA3 has another potential ambiguity. Frequency n-tuples m⃗ that differ only by a permutation are separately listed in Equation EA3, despite the fact that they reflect identical interactions (and necessarily have identical dependences on the phases of the input frequencies). Eliminating these multiple listings is important for normalizing nonlinear kernels (Victor and Knight, 1979) but not for the present purpose of tracking the interaction frequencies and their phases.
Permitted intermodulation frequencies and their phases in our stimulus design
Our compound grating stimuli are constructed from the first four odd harmonics of a fundamental f, namely, k⃗ = {1, 3, …, 2L − 1} and L = 4. Moreover, each component has the same congruence phase, φ, fixed to a value specific for the stimulus. Using the sign convention established above, the properties specific to our compound gratings are summarized as:
Contributions to any given response harmonic r may include nonlinear interactions of the same parity but with different orders n, and for each n there may be many alternative n-tuples of stimulus frequencies that all sum to a given r. These several contributions to the same frequency component of the response are not equally weighted for three reasons: (1) the contrasts of the components in the stimulus, a_{k}, are inversely proportional to frequency, (2) interactions at lower orders of nonlinearity are likely to be larger than interactions at higher orders of nonlinearity, and (3) the tuning of the neuron is likely to have different sensitivities to each of the component gratings. On the basis of only the generic considerations 1 and 2, the interactions due to n-tuples composed of a small number of low-frequency harmonics are expected to be larger than those composed of high-frequency harmonics, or a larger number of harmonics. For example, of the many contributions to the first harmonic response, F_{1}, the third-order contributions from the triplets f + f − f and −f − f + 3f are expected to be much stronger than the third-order contribution to F_{1} from the triplet 3f + 5f − 7f, and both are likely to be stronger than the fifth-order contribution to F_{1} from the quintuplet f + f − 3f − 5f + 7f. The several nonlinear contributions to a specific response frequency are not only weighted differently, but they also have different but lawful dependence on the congruence phase φ that describes the stimulus. This phase signature is explored below.
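The alternative n-tuples that contribute to a given response harmonic can be listed by brute force. The sketch below is illustrative only: it enumerates unordered n-tuples of signed odd harmonics (±1, ±3, ±5, ±7) that sum to r and ranks them by the product of component contrasts, assuming a_{k} proportional to 1/|k|. It models generic consideration 1 alone; the dependence on nonlinearity order (consideration 2) and on neuronal tuning (consideration 3) is not represented, and the helper names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod

# Signed harmonic indices: positive and negative frequencies of f, 3f, 5f, 7f.
SIGNED = sorted(s * k for k in (1, 3, 5, 7) for s in (1, -1))

def tuples_for(r, n):
    """Unordered n-tuples of signed harmonics whose indices sum to r."""
    return [c for c in combinations_with_replacement(SIGNED, n)
            if sum(c) == r]

def contrast_weight(tup):
    """Product of component contrasts, assuming a_k proportional to 1/|k|."""
    return prod(1.0 / abs(k) for k in tup)

# Third-order contributions to the first harmonic, strongest first.
third = sorted(tuples_for(1, 3), key=contrast_weight, reverse=True)
# Fifth-order contributions to the first harmonic.
fifth = tuples_for(1, 5)
```

Under these assumptions the two strongest third-order triplets are (−1, 1, 1) and (−1, −1, 3), that is, f + f − f and −f − f + 3f, and the triplet 3f + 5f − 7f is much weaker, in line with the example in the text.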
By design, the relative phases of the components are fixed in each of our compound gratings at the congruence phase. That is, φ_{k} = φ for k > 0 and φ_{k} = −φ for k < 0. Consequently, according to Equations EA3 and EA4, the phase φ_{k(m⃗)} of the nth-order nonlinear response at the frequencies F_{k(m⃗)} = rf will be an integer multiple p of the congruence phase φ of the components. The value of this multiplier p depends only on the number of + and − signs with which the component frequencies are summed in the n-tuples in Equation EA5. That is:
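One way to write the relation just described is sketched here, in notation assumed for illustration rather than taken from the original equation: m_{j} are the signed harmonic indices of an n-tuple, and n_{+} and n_{−} are the numbers of its positive and negative indices.

```latex
\varphi_{k(\vec{m})} \;=\; \sum_{j=1}^{n} \varphi_{m_j}
  \;=\; n_{+}\,\varphi \;-\; n_{-}\,\varphi
  \;=\; p\,\varphi ,
\qquad p = n_{+} - n_{-}, \quad n_{+} + n_{-} = n .
```

For instance, the quintuplet F_{7} − F_{1} − F_{1} − F_{1} − F_{1} has n_{+} = 1 and n_{−} = 4, so p = −3. Because n_{+} + n_{−} = n, the multiplier p always has the same parity as the order n.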
The phase signature of nonlinear responses
We now describe how, for a given response harmonic r, the realized set of phase multipliers p (organized in a column under p in Table A1 across all n orders of nonlinearity) determines the geometry of the locus of responses on the complex plane. As the congruence phase φ varies, the phase of the contribution of a response with phase multiplier p varies as pφ. For example, a response component with multiplier p = +1, whether it is linear or nonlinear, has the same phase offset with respect to the stimulus cycle, independent of φ. Thus, as φ varies on the [0, π) half-circle, the contribution of this component to the response also runs a half circle on the complex plane, in the same direction as the stimulus. A nonlinear response with p = −2, however, will describe a full circle on the complex plane, and this trajectory is in the opposite direction to the rotation of stimulus phase. Independent of the order n of nonlinearity, response components with the same phase multiplier p will have a relative phase that varies in the same way with φ. Thus, it suffices to consider the geometry of each of the p-fold “wrappings” of the circle and how these contributions add.
As described in the main text, we plot responses in the complex plane, after rotating their phases backward by an amount −hφ for some integer h. Typically, h = 1 (as in Fig. 9a,c) or h = 0 (as in Fig. 9b,d). The rationale is that response components with phase multipliers p = 1 (which includes the linear part) will, in the absence of noise, be plotted as a single vector independent of φ. More generally, response components with phase multiplier p will generate contributions whose phases (on these plots) vary as (p − h)φ. It suffices to examine the contributions of such components for phase multipliers p and nonlinearity orders n that are of the same parity as the response harmonic r. Each response component with p = h determines a response, which after the phase adjustment of −hφ is a constant vector, and thus the sum of these responses is a constant vector as well. All contributions with p = h + 2 will together determine a circle, because the amplitude of each component is independent of φ, and the phases (in this plot) will covary as 2φ. All contributions with p = h − 2 also trace out a circle. These two circles may have different radii and starting phases, and the responses move in opposite directions around them. The geometric locus on the plane of the vector sum of pairs of points in phase correspondence, on two concentric full circles that are traversed in opposite directions (p = h + 2 and p = h − 2), with possibly different amplitudes and starting phases, is an ellipse. Tilt and eccentricity of the ellipse depend on the difference in the parameters of the two circles. Addition of the contributions with p = h simply shifts this ellipse away from the origin.
To see that these circles indeed sum to an ellipse, consider the (x, y) coordinates (the real and imaginary parts, respectively, on the complex plane) of the points that are in phase correspondence on the two circles. The (x, y) coordinates of the p = h + 2 trajectory, parametric in phase φ, can be represented as x = r_{+} cos(2φ + ϕ_{+}), y = r_{+} sin(2φ + ϕ_{+}), where r_{+} is the amplitude of the (vector-sum) total contributions of the nonlinearities with p = h + 2 and ϕ_{+} is its phase. Correspondingly, the (x, y) coordinates of the trajectory on the p = h − 2 circle, which is traversed in the opposite direction, can be represented as x = r_{−} cos(−2φ + ϕ_{−}), y = r_{−} sin(−2φ + ϕ_{−}). By substituting a = r_{+} + r_{−}, b = r_{+} − r_{−}, θ = (ϕ_{+} + ϕ_{−})/2, φ_{0} = (ϕ_{+} − ϕ_{−})/2, coordinate summation will result in Equation 5 of the main text, which is a parametric form for an ellipse. This completes the proof that any combination of responses with ‖p − h‖ ≤ 2 will generate an elliptic locus on the complex plane, as the congruence phase of the stimulus spans the [0, π) half-circle.
Nonlinear responses with ‖p − h‖ > 2 will generally contribute to distortions of the ellipse (such as concavity or symmetry breakdown). We will show below that an elliptic fit to the phase plot of the response harmonics at output frequencies up to the third harmonic F_{3} provides an adequate approximation to nonlinearities of orders up to and including the fourth. Conversely, significant departures in the DC, …, F_{3} data from an ellipse (when plotted parametric in the congruence phase of the stimulus) indicate the presence of nonlinearities of order 5 or higher. Significant high-order nonlinearities identified in the responses of a neuron suggest the existence in the neuron's receptive field mechanism of a static nonlinearity with a singularity, such as half-wave rectification.
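The elliptical locus can also be confirmed numerically. The sketch below builds a synthetic response harmonic from three contributions (p = h, p = h + 2, and p = h − 2), rotates it back by −hφ, and checks that every point satisfies the equation of an ellipse with semi-axes a = r_{+} + r_{−} and |b| = |r_{+} − r_{−}|, tilted by θ and shifted by the constant p = h vector. All parameter values and function names are arbitrary assumptions chosen for illustration.

```python
import cmath
import math

def rotated_response(phi, h, c, r_plus, ph_plus, r_minus, ph_minus):
    """Synthetic response harmonic after rotating its phase by -h*phi."""
    z = (c * cmath.exp(1j * h * phi)                                # p = h
         + r_plus * cmath.exp(1j * ((h + 2) * phi + ph_plus))       # p = h + 2
         + r_minus * cmath.exp(1j * ((h - 2) * phi + ph_minus)))    # p = h - 2
    return z * cmath.exp(-1j * h * phi)

def ellipse_residual(z, c, r_plus, ph_plus, r_minus, ph_minus):
    """Zero when z lies on the ellipse predicted from the two circles."""
    a, b = r_plus + r_minus, r_plus - r_minus
    theta = 0.5 * (ph_plus + ph_minus)
    w = (z - c) * cmath.exp(-1j * theta)   # undo the shift and the tilt
    return (w.real / a) ** 2 + (w.imag / b) ** 2 - 1.0

# Arbitrary illustrative parameters (r_plus != r_minus so that b != 0).
params = dict(c=0.5 + 0.2j, r_plus=1.0, ph_plus=0.7, r_minus=0.4, ph_minus=-0.3)
phis = [math.pi * i / 64 for i in range(64)]   # congruence phases in [0, pi)
max_residual = max(
    abs(ellipse_residual(rotated_response(phi, 1, **params), **params))
    for phi in phis)
```

The residual vanishes to rounding error for every φ, which is the numerical counterpart of the substitution argument. When r_{+} = r_{−} the ellipse degenerates to a line segment (b = 0), a case this sketch does not handle.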
F_{1}: The response measured at the fundamental temporal frequency F_{1} includes all components listed in Table A1 under the column r = 1, including but not necessarily limited to a linear part (n = 1). When plotted with h = 1, the contribution from the linear part will be stationary. That is, its phase multiplier is p = 1, and ‖p − h‖ = 0. The third-order contributions (n = 3), as seen from Table A1, are characterized by phase multipliers p ∈ {+1, −1}, so ‖p − h‖ ∈ {0, 2}. Thus these terms can produce a circle but not an ellipse. The fifth-order contributions (n = 5) are characterized by p ∈ {+3, +1, −1, −3}. The first three of these, p ∈ {+3, +1, −1} or ‖p − h‖ ∈ {0, 2}, thus could produce an ellipse. However, p = −3 corresponds to ‖p − h‖ = 4, and an intermodulation of three frequency components with this sign pattern leads to a distortion from an elliptical locus. Still higher odd-order nonlinearities (n = 7, 9, …) will contain additional components with ‖p − h‖ ≤ 2, and also ‖p − h‖ ≥ 4. Thus, when F_{1} responses are plotted with the phase correction h = 1, the parameters of the best-fitting ellipse reflect the combined influence of third-order and higher-order responses, but the distortion is attributable solely to fifth and higher odd-order nonlinearities.
F_{3}, F_{5}, etc: Plots of these higher harmonic responses with the phase correction h = 1 will also yield, at worst, an ellipse, unless nonlinearities of order n = 5 or higher are present. For example, the response measured at the temporal frequency F_{3} includes all components listed in Table A1 under the column r = 3. These include a linear component (elicited by the F_{3} grating, in the n = 1 row) and also nonlinear responses (in the rows n = 3 and n = 5; and for higher n, data not shown). For h = 1 and n = 3, only terms with p ∈ {+3, +1, −1} are present, i.e., ‖p − h‖ ≤ 2 for all terms up to third order. Fifth-order nonlinearities (n = 5 row) include a single term F_{7} − F_{1} − F_{1} − F_{1} − F_{1} with p = −3. Because ‖p − h‖ = 4 for this term, it can produce a distortion of the locus from an ellipse. A similar analysis holds for F_{5} (r = 5 column of Table A1) and higher odd harmonics. Note, however, that responses at odd harmonics higher than F_{7} will not have a linear component, because such components were not present in the stimulus by design, and intermodulations for certain patterns, such as p = −3 for (n, r) = (5, 5), are not possible to realize because of constraints presented by Equation EA5.
F_{2}, F_{4}, etc: The responses at the even harmonics of the fundamental temporal frequency do not contain a linear part. For this reason, plotting their responses with a phase correction h = 1 is no longer natural. Rather, one could choose to reference even harmonic responses to the phase of the second-order (lowest even order) contributions. Second-order contributions to all even harmonic response frequencies come in one of two intermodulation patterns: either with difference frequencies (p = 0; e.g., F_{5} − F_{1} in column r = 4 of Table A1) or with sum frequencies (p = 2; e.g., F_{1} + F_{1} for r = 2). Compensating for the phase of either the difference or the sum pattern (h = 0 or h = 2, respectively) seems to be an equally valid way to examine the even harmonic responses. Either way, the second-order intermodulations with the pattern chosen for phase compensation would appear stationary, whereas intermodulations with the other pattern would define the radius of a circle (because ‖p − h‖ = 2). Specifically, h = 0 makes difference-frequency components stationary and h = 2 makes sum-frequency components stationary. Fourth-order nonlinear contributions to F_{2} (row n = 4 and column r = 2 of Table A1) are restricted to p ∈ {−2, 0, +2}. Choosing h = 0 leads to ‖p − h‖ = 0 or 2, and thus, at worst, an elliptical locus, but choosing h = 2 may lead to ‖p − h‖ = 4, and thus to distortions of an ellipse. However, fourth-order nonlinear contributions to F_{4} and higher-order even harmonics (row n = 4, and columns r = 4 and higher of Table A1) include permitted patterns that allow for ‖p − h‖ ≥ 4, no matter what is chosen for h (e.g., for r = 4, F_{1} + F_{1} + F_{1} + F_{1} has p = 4 so that ‖p − h‖ = 4 if h = 0, whereas F_{7} − F_{1} − F_{1} − F_{1} has p = −2 so that ‖p − h‖ = 4 if h = 2). Therefore, a distortion of the locus from an ellipse is expected in the plot of such nonlinear components.
Using arguments similar to those in the discussion of the odd harmonics, one can show that nonlinearities of fourth and higher even orders contribute to the elliptical locus and also to distortions from an elliptical locus for all even harmonics, except for the DC component.
DC: The spike count is a special case of even harmonics because it only has a real part. That is, every term in Equation EA3 is accompanied by its complex conjugate. A plot of the DC response, a scalar, is equivalent to plotting the DC response as a complex value with h = 0, and projecting it on the real axis. Table A1 (column r = 0) shows that contributions to DC from second-order nonlinearities (n = 2) all have p = 0, and are thus independent of the congruence phase φ. Contributions from fourth-order nonlinearities (n = 4 of Table A1) include terms with p = 0 and also p = ±2. The former terms are phase independent; the latter terms correspond to the projection of motion around an ellipse onto the real axis, i.e., a sinusoidal function of congruence phase φ (see Eq. 4 in Results).
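The multiplier sets quoted in the preceding paragraphs can be reproduced by enumeration. The sketch below is an illustration rather than a reconstruction of Table A1: it assumes that the stimulus contains only the signed harmonics ±1, ±3, ±5, ±7, and that p is the number of positive indices minus the number of negative indices in an n-tuple. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Signed harmonic indices available in the four-component stimulus.
SIGNED = sorted(s * k for k in (1, 3, 5, 7) for s in (1, -1))

def phase_multipliers(r, n):
    """Set of p = (#positive - #negative) over n-tuples that sum to r."""
    found = set()
    for tup in combinations_with_replacement(SIGNED, n):
        if sum(tup) == r:
            found.add(sum(1 if k > 0 else -1 for k in tup))
    return found

def max_offset(r, n, h):
    """Largest |p - h| among the order-n contributions to harmonic r."""
    return max(abs(p - h) for p in phase_multipliers(r, n))

f1_third = phase_multipliers(1, 3)    # third-order terms at F1
f1_fifth = phase_multipliers(1, 5)    # fifth-order terms at F1
f2_fourth = phase_multipliers(2, 4)   # fourth-order terms at F2
```

A value of max_offset no greater than 2 corresponds to an at-worst elliptical locus for that order and harmonic, whereas a value of 4 or more signals a possible distortion. For example, max_offset(2, 4, 0) is 2 but max_offset(2, 4, 2) is 4, matching the statement that h = 0 is the safer choice for F_{2}.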
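The sinusoidal dependence of the DC response up to fourth order can be made explicit with a short derivation. The symbols C_{0} (the sum of all p = 0 contributions, which is real) and A (the summed complex amplitude of the p = +2 contributions) are introduced here for illustration only and are not the paper's notation; each p = +2 term is assumed to be paired with its complex conjugate, which has p = −2.

```latex
R_{\mathrm{DC}}(\varphi)
  \;=\; C_0 + A\,e^{\,i2\varphi} + A^{*}e^{-i2\varphi}
  \;=\; C_0 + 2\,|A|\cos\!\bigl(2\varphi + \arg A\bigr).
```

Under these assumptions the spike count varies with the congruence phase as a single sinusoid in 2φ about a constant offset, completing one full cycle as φ spans [0, π).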
Footnotes

This work was supported by National Institutes of Health Grants EY9314 (J.D.V., F.M.), EY7138 (D.S.R.), and GM7739 (D.S.R.).

Correspondence should be addressed to Ferenc Mechler, Department of Neurology and Neuroscience, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021. Email: fmechler@med.cornell.edu.