We explore the hypothesis that binocular simple cells in cat areas 17 and 18 show subregion correspondence, defined as follows: within the region of overlap of the two eyes’ receptive fields, their ON subregions lie in corresponding locations, as do their OFF subregions. This hypothesis is motivated by a developmental model (Erwin and Miller, 1998) that suggested that simple cells could develop binocularly matched preferred orientations and spatial frequencies by developing subregion correspondence.
Binocular organization of simple cell receptive fields is commonly characterized by two quantities: interocular position shift, the distance in visual space between the center positions of the two eyes’ receptive fields; and interocular phase shift, the difference in the spatial phases of those receptive fields, each measured relative to its center position. The subregion correspondence hypothesis implies that interocular position and phase shifts are linearly related. We compare this hypothesis with the null hypothesis, assumed by most previous models of binocular organization, that the two types of shift are uncorrelated.
We demonstrate that the subregion correspondence and null hypotheses are equally consistent with previous measurements of binocular response properties of individual simple cells in the cat and other species and with measurements of the distribution of interocular phase shifts versus preferred orientations or versus interocular position shifts. However, the observed tendency of binocular simple cells in the cat to have “tuned excitatory” disparity tuning curves with preferred disparities tightly clustered around zero (Fischer and Krüger, 1979; Ferster, 1981; LeVay and Voigt, 1988) follows naturally from the subregion correspondence hypothesis but is inconsistent with the null hypothesis.
We describe tests that could more conclusively differentiate between the hypotheses. The most straightforward test requires simultaneous determination of the receptive fields of groups of three or more binocular simple cells.
- binocular cell
- simple cell
- cat visual cortex
- striate cortex
- disparity tuning
- owl visual Wulst
There is considerable evidence that the preferred orientations and spatial frequencies of simple cells in cat area 17 are determined by the spatial arrangement of ON and OFF subregions in their receptive fields (RFs) (Movshon et al., 1978a; Jones and Palmer, 1987a; Ferster et al., 1996), as originally suggested by Hubel and Wiesel (1962). However, the relationship between the subregions in the right- and left-eye RFs remains unclear, primarily because of the difficulty of determining the precise alignment of the eyes during physiological experiments.
There are several proposed explanations for the disparity-modulated response properties of simple cells, and these make different predictions for intereye subregion relationships. The traditional, “position-based” model proposes that left- and right-eye RFs of simple cells differ only by their strengths of input and a possible position shift in the locations of their RF centers (Fig. 1 a), but that they have identical internal organization of ON and OFF subregions (Hubel and Wiesel, 1962; Maske et al., 1984). Such position shifts have been shown to exist (Barlow et al., 1967; Nikara et al., 1968; Joshua and Bishop, 1970). However, the reverse correlation technique revealed that left- and right-eye RFs often also differ by a phase shift in the arrangement of their ON and OFF subregions relative to their RF centers (Freeman and Ohzawa, 1990; DeAngelis et al., 1991, 1995a; Ohzawa et al., 1997). The phase of each eye’s RF can vary with time (DeAngelis et al., 1993, 1995b), yet the phase shift between them remains remarkably constant (DeAngelis et al., 1995a; Ohzawa et al., 1996). These observations led to the proposal of a “phase-based” model (Freeman and Ohzawa, 1990; Nomura et al., 1990) (Fig. 1 b), which emphasizes that all categories of shapes observed for disparity tuning curves can be produced through these phase shifts alone, with or without accompanying position shifts.
Because both position and phase shifts have been shown to occur in the same set of cells (Anzai et al., 1997), both must be included in any viable model. Such so-called “hybrid” models (Jacobson et al., 1993) have to date allowed position and phase shifts to be independently distributed (Fleet et al., 1996; Zhu and Qian, 1996).
We have hypothesized a particular form of hybrid model in which position and phase shifts are not independently distributed. This model arose from our theoretical studies of correlation-based, activity-instructed development of layer 4 of cat area 17 or 18 (Erwin and Miller, 1996, 1998). These studies addressed how each binocular simple cell can develop approximately identical preferred orientation and spatial frequency in its two monocular RFs, assuming that development is guided by correlations in the activities of monocular ON- and OFF-center lateral geniculate nucleus (LGN) inputs. We found one simple solution to this problem in which, throughout the region of overlap of the two eyes’ RFs, each ON-center subregion in the left-eye RF spatially overlaps only with an ON-center subregion in the right-eye RF and similarly for OFF-center subregions (Fig. 1 c). This solution allows both position and phase shifts but requires that they be linearly related in a specific way, as we shall show. We call this the “subregion correspondence” model.
The model studied by Erwin and Miller (1998) can only develop binocular matching of preferred orientations by developing either subregion correspondence or subregion anticorrespondence (ON subregions in one eye coincide with OFF subregions in the other eye, and vice versa). These two alternatives result from quite different LGN activity correlation structures and so are not likely to codevelop (but see Discussion). Subregion anticorrespondence is not a good candidate to fit existing data in the cat and so will not be further studied here. As discussed by Erwin and Miller (1998), adding more complexities to our developmental model could conceivably allow other binocular RF relationships to emerge, either in layer 4 or in other cortical layers. However, presently the only developmental models shown to be capable of generating binocular RFs with matched preferred orientations and spatial frequencies (Erwin and Miller, 1996, 1998; Shouval et al., 1996) produce only cells that obey subregion correspondence (or anticorrespondence). Hence, it is worthwhile to examine the plausibility of this prediction.
In this article, we compare the predictions of two hybrid models: one in which position and phase shifts are constrained to produce subregion correspondence and an “unconstrained hybrid” model in which position and phase shifts are uncorrelated. We show that data on binocular response properties of individual simple cells are equally consistent with both of the hybrid models. However, data on the distribution of preferred disparities for binocular cells in cat areas 17 and 18 strongly favor the subregion correspondence model. In addition, although both hybrid models are equally consistent with observed relationships between interocular phase shift and preferred orientation, only the subregion correspondence model allows this relationship to emerge from a developmental process in which binocular RF organization has no explicit dependence on preferred orientation. Finally, we show that both hybrid models are equally consistent with existing joint measurements of interocular phase and position shifts (as determined relative to reference cells). Additional such measurements, involving groups of three or more simultaneously measured cells, are required to definitively decide between the models.
MATERIALS AND METHODS
Here we present basic definitions and assumptions as well as the mathematical tools that will be used to derive our results.
Corresponding retinal points. We seek to represent positions in both eyes’ visual fields using a single coordinate system. To do so, we must assume the existence of corresponding retinal points (CRPs), such that a one-to-one correspondence can be established between points in the left and right eyes’ visual fields.
There are many ways in which CRPs can be defined. The simplest, and most common, definition is geometrical. Points on the two retinae that are at the same angular and radial position relative to their respective foveae are said to be in geometrical retinal correspondence. When both eyes foveate a distant star, the image of each other star in the sky falls on geometrically corresponding retinal points.
Many studies have determined mean right-eye and left-eye retinotopic positions that provide input to single positions in cortex (Barlow et al., 1967; Nikara et al., 1968; Joshua and Bishop, 1970; von der Heydt et al., 1978; Cooper and Pettigrew, 1979; Pettigrew et al., 1984; Pettigrew and Dreher, 1987). These studies have shown that the mean left- and right-eye RFs at single cortical positions need not represent geometrical CRPs (cf. Barlow et al., 1967, page 336) and in fact show systematic deviation from geometrical correspondence as a function of cortical position. It is common to refer to the RFs of such cells as having a fixed disparity relative to the set of geometrically defined corresponding points. We have found it more convenient to define physiologically corresponding retinal points to refer to the mean RF locations on the retinae of cells at a single position in the cortex or any other structure in which both eyes share a common retinotopic organization, such as superior colliculus or LGN. The locations of physiological CRPs determined in V1 correlate better with the psychophysically determined region in three-dimensional space in which objects are seen singly by the two eyes (Hering, 1864; von Helmholtz, 1866; Hillebrand, 1893) than does the region determined from geometrical CRPs (see discussion in Tyler and Scott, 1979; Tyler, 1991).
Physiological CRPs based on RFs of cells in the LGN can be defined as follows. Because individual LGN cells are monocular, one aligns the mean center points of the sets of cells in neighboring groups across the border between left- and right-eye layers within the binocular visual field of one LGN. LeVay and Voigt (1988) used such measurements at a single location between layers A and A1 of cat LGN to monitor for eye movements and to determine a single pair of binocularly corresponding points. Pettigrew and Dreher (1987) have made similar measurements at single points between the A and A1 layers as well as between the C1 and C2 layers of the cat LGN; they reported that the physiological CRPs so determined deviate systematically both from geometrical CRPs and from each other, but that each agrees with physiological CRPs determined in the cortical area to which it projects (area 17 for A/A1, area 19 for C1/C2). Such a measurement at a single point will suffice to determine physiological CRPs throughout a region of the visual field only if the local mapping between physiological CRPs in the two eyes consists simply of a translation. If the mapping includes both a translation and a rotation, attributable perhaps to eye rotations, or involves a nonlinear transformation, the mapping technique must be extended to include multiple recording locations within the LGN.
Because the subregion correspondence model is based on the developmental model of Erwin and Miller (1998), physiological CRPs for the cat should be defined here in terms of firing correlations in the LGNs. The mean RF position of a set of nearby ON- (or OFF-) center cells in a contralateral eye layer of the LGN will occur at some point on the contralateral eye’s retina. The physiologically corresponding point on the ipsilateral eye’s retina is defined here as the mean position of the RFs of the ON- (or OFF-) center cells in the ipsilateral eye LGN layer whose activities had been most strongly correlated with those of the chosen contralateral eye cells during the development of cortical RFs. In the developmental model, the two eyes’ RFs show subregion correspondence under this definition of physiological CRPs. The LGN location-based method described above probably gives a good estimate of physiological CRPs defined in the correlation-based way and is obviously easier to assess and so is probably the best definition for practical tests of subregion correspondence.
Simplifying assumptions. In this article, we will calculate disparity tuning curves of cells from a description of their two eyes’ RFs. We will also compare our predictions of RF structure with data from experiments that map right- and left-eye RFs independently. To do these things, we must make several simplifying assumptions.
We first assume that experimental RFs are initially mapped on the surface of a flat screen on which stimuli are presented (Fig. 2 a). A set of points in any small, local region of the left retina will map to some set of points on this screen (Fig. 2 a, top left shaded region). The set of corresponding retinal points (by whatever definition) on the right retina will map to some other set of points on the screen (Fig. 2 a, top right shaded region). We assume that these two sets of points can, to an acceptable degree of approximation, be made to coincide on the screen (Fig. 2 a, bottom shaded region) through a translation and rotation of one or both sets. Then points in both eyes can be described in a common coordinate system, here labeled (H, V).
This assumption will not be valid over large areas of the visual fields, even for the simple geometrical definition of CRPs. In this case, CRPs from the two eyes map to common points in visual space only along a locus of points in visual space called the horopter. For fixation within the horizontal meridian, the horopter has the shape of a circle (Aguilonius, 1613), commonly referred to as the Vieth–Müller circle, passing through the point of fixation. Horopters defined in terms of physiological or psychophysical criteria approximately coincide with this circle, although there are systematic deviations, which can be explained by the difference between physiological and geometrical CRPs (see references in previous section). The errors incurred by treating the horopter as coincident with a flat screen have been estimated and found to be negligible, for the purpose of comparing center positions of RFs, for cells in the central visual field horizontally out to ∼10–15° eccentricity (Barlow et al., 1967). (The range of eccentricities over which the errors are negligible for the more precise measurements needed to study the placement of subfields within RFs has not been calculated, to our knowledge, but is likely to be smaller).
Finally, we assume that effects of changes in eye positions and rotations that occur during the experiment have been eliminated, or at least that the data can be divided into subgroups within which all such effects are eliminated. Such subgroups could correspond to data from cells measured either simultaneously or along with measurements of the binocular RFs of a constant set of reference cells that allow correction for eye movements, or more generally, measured during a period when the eye positions are known to have remained stable, if this can be established. We will assume throughout this article that corrections have been made for any remaining movements, although we do not mean to minimize the difficulty in practice of achieving this.
Within the limits of these simplifying approximations, the results of any experiments should differ from one another only in ways that can be explained by differences in the relative rotations and translations applied to the left- and right-eye RFs, even when those experiments used different definitions of CRPs.
Set of cells considered. The subregion correspondence model, like the other models of binocular organization considered here, applies only to cells whose responses to stimuli can be well-described by a linear sum followed by a static nonlinearity: the response is determined simply by summing the input to the two eyes, followed by application of a threshold function. Here, the input to a single eye is given by linear summation of the luminance pattern presented to that eye, weighted by its RF. The ability of such a simple response model to reasonably describe responses of simple cells in primary visual cortex to monocular (DeAngelis et al., 1993) or binocular stimuli, including their disparity tuned responses, has been demonstrated previously (Ferster, 1981; Ohzawa and Freeman, 1986; Nomura et al., 1990; Zhu and Qian, 1996).
This response model requires that the strength of inhibition induced by a stimulus in either eye be approximately equal to the strength of excitation that would be induced by a stimulus of the opposite polarity in the same eye. For example, if a stimulus of one contrast gives inhibition, the linear sum in the response model requires that a reversed contrast stimulus must give excitation of equal strength. Thus, this response model excludes those cells that show weak or no excitatory responses from one eye (e.g., ocular dominance classes 1–2, 6–7, on the traditional 1–7 scale), yet show strong inhibition from that eye (Sillito et al., 1980; Ferster, 1981; Ohzawa and Freeman, 1986; LeVay and Voigt, 1988). None of the models under consideration here makes predictions about the RFs or tuning properties of such cells.
We shall also assume, consistent with much experimental evidence, that each eye’s RF can be well-described by a Gabor function (Jones and Palmer, 1987a,b; DeAngelis et al., 1993), and that these functions for a given cell have the same preferred orientation and spatial frequency for each eye (Skottun and Freeman, 1984; DeAngelis et al., 1995a; Ohzawa et al., 1996).
We will ignore the time dependence of RFs. However, our results should apply to cells with space–time inseparable RFs, that is, cells for which the phases in each eye’s RF vary as a function of the time between stimulus and response (DeAngelis et al., 1993, 1995b), as well as to space–time separable RFs. The position of each RF’s Gaussian envelope and the interocular phase shift are each approximately constant in time for most V1 cells, including cells with space–time inseparable RFs (DeAngelis et al., 1995a; Ohzawa et al., 1996). That is, interocular position and phase shifts tend not to vary in time. Thus, it makes sense to speak of each cell having a definite interocular position and phase shift, even for space–time inseparable RFs. It is the distribution across cells of these shifts and their relationship that will define the models we study. Locations of peaks and troughs in disparity tuning curves given by Equation 6 (below) should not be affected by including time dependencies that do not alter interocular position or phase shifts (Ohzawa et al., 1996). Thus, our conclusions about disparity tuning peaks for cells with one or another relationship between interocular position and phase shifts should also apply to cells with time-dependent RFs. This is supported by the fact that, when responses can be evoked by stimuli moving in opposite directions, the preferred disparities of most cells do not seem to change, although the magnitudes of the responses can be affected (Poggio and Talbot, 1981; Poggio et al., 1988).
Mathematical description of binocular RFs. The left- and right-eye RFs of any cell i can be represented by functions R_Li and R_Ri, such that the monocular inputs are given by the sum of point-by-point multiplications in visual space of the stimulus, S, and the RFs. We let positive and negative values of these functions represent, respectively, subregions showing excitation by ON and OFF stimuli and showing opponent (“push-pull”) inhibition by OFF and ON stimuli (Palmer and Davis, 1981; Ferster, 1988; Hirsch et al., 1998).
After the necessary translations and rotations have been applied to measured RFs to bring CRPs into alignment on the stimulus screen, the RFs of any individual cell i can be described most simply using an (x_i, y_i) coordinate system tailored to that cell (Ohzawa and Freeman, 1986). The center of this coordinate system is aligned with the center of the left-eye RF (Fig. 2 b), with the x_i- and y_i-axes oriented perpendicular and parallel, respectively, to that cell’s preferred orientation. This yields the following description of left- and right-eye RFs: Equation 1 Equation 2 The variables ς_Li and ς_Ri determine the width of the left- and right-eye RFs, respectively. The spatial modulation of ON and OFF subregions is modeled by sinusoids with spatial frequency f_i and with phases φ_Li and φ_Ri in the left and right eyes, respectively. The difference between the right-eye center and left-eye center locations is called the position shift: (Δx_i, Δy_i). Likewise, the phase shift is defined as Δφ_i = φ_Ri − φ_Li. Typically, x_i and y_i are measured in degrees in visual space, whereas f_i is in cycles per degree of visual space, and φ_Ri and φ_Li are measured in radians such that Δφ_i/(2πf_i) gives degrees in visual space.
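For orientation, one Gabor parameterization consistent with the definitions in this paragraph is sketched below. It is a reconstruction under stated assumptions, not a transcription of Equations 1 and 2. The origin is at the left-eye RF center and the right-eye RF is displaced by the position shift. The envelope form exp(−r²/2ς²) is the one implied by the relation N = 9.79fς used later in Materials and Methods. The amplitude factors A_Li and A_Ri are placeholders introduced here; equal integrated envelope strength in the two eyes would correspond to amplitudes proportional to 1/ς².

```latex
\begin{align}
R_{Li}(x_i, y_i) &= A_{Li}\,
  \exp\!\left(-\frac{x_i^{2} + y_i^{2}}{2\varsigma_{Li}^{2}}\right)
  \cos\!\left(2\pi f_i x_i + \phi_{Li}\right), \\
R_{Ri}(x_i, y_i) &= A_{Ri}\,
  \exp\!\left(-\frac{(x_i - \Delta x_i)^{2} + (y_i - \Delta y_i)^{2}}{2\varsigma_{Ri}^{2}}\right)
  \cos\!\left(2\pi f_i (x_i - \Delta x_i) + \phi_{Ri}\right).
\end{align}
```

In this form each phase is measured relative to its own RF center, so a pure position shift (Δφ_i = 0) translates the whole right-eye RF, whereas a pure phase shift (Δx_i = Δy_i = 0) moves the subregions within a fixed envelope.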
Cell parameters. We separately simulate experiments performed in central cat area 17 (0–5° eccentricity) and more peripheral area 17 (8–12° eccentricity). In each case, we use spatial frequency and position shift distributions derived from published data for these regions (Table 1). We also simulate results from some experiments in which RFs were mapped by reverse correlation (DeAngelis et al., 1991; Anzai et al., 1997) in area 17. Because the eccentricities were known only to be within the central 15° (R. Freeman and I. Ohzawa, personal communication), we must test the effects of using position shift data gathered at various eccentricities together with a distribution of spatial frequencies fit directly to the experimental data. Parameters used for all three types of simulation are given in Table 1.
We specify distributions of interocular position shifts in the (H, V) coordinate system, in which physiological CRPs in the two retinae have identical coordinates. We choose the H and V axes to represent the horizontal and vertical directions, respectively. Defining this coordinate system during physiological measurements requires correction for the arbitrary aim of the two eyes with respect to the stimulus screen and possible rotations of the two eyes about their visual axes (Barlow et al., 1967; von der Heydt et al., 1978; Cooper and Pettigrew, 1979) (Fig. 2).
We choose position shifts ΔH_i and ΔV_i randomly from normal distributions with independent SDs, with parameters based on measurements by Joshua and Bishop (1970). These measurements appear relatively consistent with other measurements in cat area 17: the distribution of position shifts is approximately isotropic in the central visual field (Nikara et al., 1968; Joshua and Bishop, 1970; von der Heydt et al., 1978) but appears to be wider in the horizontal direction than the vertical direction more peripherally. Barlow et al. (1967) found a 3:1 ratio of horizontal to vertical position shift widths for cells between 5 and 15° eccentricity. This is somewhat larger than the 2.3:1 ratio found by Joshua and Bishop (1970) at 8–12°; they argued that the combination of data from a large range of eccentricities in the earlier study may have caused an overestimation of the anisotropy. von der Heydt et al. (1978) presented data from peripheral cells in two cats. In one of these (cat 12), they also found a bias toward larger horizontal than vertical position shifts for cells at 5–10° eccentricity. Data from the other cat (cat 7) are difficult to interpret because both central and peripheral cells were included.
We choose preferred orientations θ_i randomly from a uniform distribution between 0 and π. Here, 0 represents vertical preferred orientation (y_i-axis parallel to V-axis), and orientation angle increases with counterclockwise rotation. Then from Figure 2 b, the position shifts may be equivalently expressed as: Equation 3
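A rotation consistent with the stated convention (θ_i = 0 when the y_i-axis is parallel to the V-axis, angle increasing counterclockwise) can be written as follows. This is a sketch only: it assumes that H increases to the right and V upward, and the sign convention of Figure 2 b may differ.

```latex
\begin{align}
\Delta x_i &= \phantom{-}\Delta H_i \cos\theta_i + \Delta V_i \sin\theta_i, \\
\Delta y_i &= -\Delta H_i \sin\theta_i + \Delta V_i \cos\theta_i .
\end{align}
```

At θ_i = 0 this reduces to (Δx_i, Δy_i) = (ΔH_i, ΔV_i), and for any θ_i the length of the shift vector is preserved.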
To choose spatial frequencies, we let μ_i = −ln f_i and choose the μ_i randomly from probability distributions P(μ_i) given by normal distributions with means μ and SDs ς, 𝒩(μ, ς). These parameters are chosen to approximately fit observed distributions. For clarity, the frequency f_i = e^(−μ_i) cycles/deg corresponding to the mean of each distribution is also given in Table 1.
To choose RF widths, we first assume that the distribution of the numbers of subregions in the left- and right-eye RFs, N_Li and N_Ri, is fairly constant across location in both areas 17 and 18. We chose these values randomly from a uniform distribution over the region shown in Figure 3 b, which was fit by hand to the data of Ohzawa et al. (1996) (Fig. 3 a). As in that article, the number of subregions was defined as twice the spatial frequency multiplied by the width of the RF Gaussian envelope at 5% of its maximum height, or N = 9.79fς. After choosing N_Li and N_Ri, we use this formula to assign values to ς_Ri and ς_Li.
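The factor 9.79 can be checked directly, assuming the Gaussian envelope has the form exp(−x²/2ς²) along the x-axis. The symbols x_{5%} and w are introduced here for the half-width and full width at 5% of maximum height.

```latex
\begin{align}
\exp\!\left(-\frac{x_{5\%}^{2}}{2\varsigma^{2}}\right) = 0.05
  \;&\Longrightarrow\;
  x_{5\%} = \varsigma\sqrt{2\ln 20} \approx 2.448\,\varsigma, \\
w = 2\,x_{5\%} &\approx 4.895\,\varsigma, \\
N = 2 f w &\approx 9.79\, f \varsigma .
\end{align}
```

The agreement with the quoted coefficient supports this envelope form; a definition of ς differing by a factor of √2 would instead give a coefficient of about 6.92 or 13.85.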
Left-eye phases, φ_Li, were chosen randomly from a uniform distribution between −π and π. For simulations of the subregion correspondence model, right-eye phases, φ_Ri, were determined from φ_Li and Δx_i, as explained later. For simulations of other models, right-eye phases were drawn randomly from the same distribution as left-eye phases.
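The sampling recipe described in this section can be sketched in Python as follows. This is an illustration under several assumptions, not the simulation code used for the results:

- The function and argument names are hypothetical.
- The numerical arguments stand in for the Table 1 values, which are not reproduced here.
- Position shifts are taken to have zero mean.
- The rotation into each cell's (x, y) frame uses one plausible sign convention.
- A square range of subregion counts replaces the hand-fit region of Figure 3 b.
- The subregion correspondence constraint is taken to be Δφ = 2πfΔx modulo 2π.

```python
import numpy as np

def wrap_phase(a):
    """Wrap angles (radians) into [-pi, pi)."""
    return (np.asarray(a) + np.pi) % (2 * np.pi) - np.pi

def sample_cells(n, sd_h, sd_v, mu_mean, mu_sd, n_sub=(1.5, 4.0),
                 model="correspondence", seed=0):
    """Draw n hypothetical cells following the recipe in the text."""
    rng = np.random.default_rng(seed)
    # Position shifts in (H, V): independent normals, ASSUMED zero mean (deg).
    dH = rng.normal(0.0, sd_h, n)
    dV = rng.normal(0.0, sd_v, n)
    # Preferred orientation: uniform on [0, pi), 0 = vertical.
    theta = rng.uniform(0.0, np.pi, n)
    # ASSUMED rotation into each cell's (x, y) frame.
    dx = dH * np.cos(theta) + dV * np.sin(theta)
    dy = -dH * np.sin(theta) + dV * np.cos(theta)
    # Spatial frequency: mu = -ln f is normal, so f = exp(-mu) (cycles/deg).
    f = np.exp(-rng.normal(mu_mean, mu_sd, n))
    # Subregion counts: a square range is a stand-in for the Fig. 3 b region.
    N_L = rng.uniform(n_sub[0], n_sub[1], n)
    N_R = rng.uniform(n_sub[0], n_sub[1], n)
    # Envelope widths from N = 9.79 * f * sigma.
    sig_L = N_L / (9.79 * f)
    sig_R = N_R / (9.79 * f)
    # Left-eye phase: uniform between -pi and pi.
    phi_L = rng.uniform(-np.pi, np.pi, n)
    if model == "correspondence":
        # ASSUMED constraint: phase shift equals 2*pi*f*dx modulo 2*pi.
        phi_R = wrap_phase(phi_L + 2 * np.pi * f * dx)
    else:
        # Unconstrained hybrid: right-eye phase drawn independently.
        phi_R = rng.uniform(-np.pi, np.pi, n)
    return dict(dH=dH, dV=dV, theta=theta, dx=dx, dy=dy, f=f,
                N_L=N_L, N_R=N_R, sig_L=sig_L, sig_R=sig_R,
                phi_L=phi_L, phi_R=phi_R)
```

Calling `sample_cells(1000, 0.5, 0.3, 0.2, 0.5)` with `model="correspondence"` or `model="unconstrained"` gives two populations that differ only in how the right-eye phase is assigned, which is the contrast between the two hybrid models.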
Computation of disparity tuning curves. In measurements of disparity tuning, the stimuli presented to each eye are identical, except for a spatial offset, or disparity. The stimuli are usually thin bars or luminance gratings aligned with the cell’s preferred orientation, the y_i-axis, and swept through both RFs at a constant disparity, D, measured along the perpendicular x_i-axis. Thus at any time, the left- and right-eye stimuli may be represented as S(x_i, y_i, t) and S(x_i + D, y_i, t), where positive and negative values indicate regions of high or low luminance relative to the mean. The input to the cell is given by Equation 4. The response is given by applying a threshold function, F_i, with threshold z_i: Equation 5. The disparity tuning curve T_i(D) of cell i is given by the summed response of the cell for all times, t, during the sweep of a stimulus at disparity D (Ferster, 1981; Freeman and Ohzawa, 1990; Nomura et al., 1990): Equation 6
Our focus will be on the locations of the peaks and troughs in the disparity tuning curves. Thus we may choose several of the above parameters somewhat arbitrarily, because they do not significantly affect these locations. We let the stimulus S always be a thin light bar (0.05° wide along the x-axis) extended along the preferred orientation (y-axis). For Gabor-type RFs, the exact width of the bar stimulus has little effect on the shape of the tuning curve as long as the bar width is less than the width of a single ON or OFF subregion of the RFs. The threshold, z_i, for each cell is set to 40% of the maximum value of its input, I_i(D, t), across all D and t. Using this definition, both stimulus intensity and RF intensity (i.e., a gain multiplying both R_Li and R_Ri) are irrelevant, because these would simply multiply the disparity curve without otherwise altering it. Setting the thresholds at a higher (or lower) percentage of the cell’s input lowers (or raises) the baseline response in the disparity tuning curves and varies the relative magnitudes of peaks and troughs but has little effect on the peak locations.
Note that, by Equations 1 and 2, our simulations consider only the case of circular, rather than elliptical, RFs, and the two eyes’ circular Gaussian envelopes have equal integrated strength. Modifying these details, although maintaining an approximately binocular cell, could affect whether portions of the responses are suprathreshold or subthreshold, but would have little effect on the positions of tuning curve peaks.
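The computation described in this section (summed binocular input, threshold at 40% of maximum input, response summed over the sweep) can be sketched as below. The sketch makes simplifying assumptions that are not taken from the paper:

- The RFs are reduced to one-dimensional Gabor profiles along the x-axis.
- Both eyes have the same width and unit gain.
- The thin bar is treated as a point sample of each profile.
- The threshold function is half-wave rectification above z.
- A left-eye bar at position s is taken to pair with a right-eye bar at s − D.

All function and argument names are hypothetical.

```python
import numpy as np

def gabor_1d(x, center, sigma, f, phase):
    """1-D Gabor profile across the preferred orientation (x in degrees)."""
    u = x - center
    return np.exp(-u**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f * u + phase)

def disparity_tuning(dx, phi_L, phi_R, f, sigma, disparities,
                     thresh_frac=0.4, n_steps=4001):
    """Summed thresholded response to a swept thin bar at each disparity."""
    D = np.asarray(disparities, dtype=float)
    # Sweep far enough to cover both RFs at every disparity tested.
    pad = 4 * sigma
    lo = min(0.0, D.min() + dx) - pad
    hi = max(0.0, D.max() + dx) + pad
    s = np.linspace(lo, hi, n_steps)        # bar position in the left eye
    ds = s[1] - s[0]
    # Left RF is centered at the origin; right RF is centered at dx.
    left = gabor_1d(s[None, :], 0.0, sigma, f, phi_L)
    # ASSUMED sign convention: right-eye bar sits at s - D.
    right = gabor_1d(s[None, :] - D[:, None], dx, sigma, f, phi_R)
    total = left + right                    # input I(D, t), shape (nD, nt)
    # Threshold: a fixed fraction of the maximum input over all D and t.
    z = thresh_frac * total.max()
    response = np.maximum(total - z, 0.0)   # ASSUMED rectifying threshold
    return response.sum(axis=1) * ds        # tuning curve T(D)
```

With this sign convention, a cell with no position or phase shift peaks at D = 0 and a pure position shift Δx moves the peak to D = −Δx; the opposite convention for D simply mirrors the curve.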
Statistical tests. We performed the statistical tests described in Results accompanying Fig. 8 as follows. We generated 50,000 points from the distribution predicted by subregion correspondence (100 such points shown in Fig. 8 c) and 50,000 points from the distribution predicted by the unconstrained hybrid model (100 such points shown in Fig. 8 e). For each distribution, 5000 points were chosen as the base distribution to compare with, and the remaining 45,000 points were used to generate 1551 sets of 29 points each. We then used the routine “ks2d2s” (Press et al., 1992, page 649) to compute D, the two-dimensional, two-sample Kolmogorov–Smirnov statistic, between (1) the 29 points in the experimental data set of Anzai et al. (1997) (Fig. 8 f) and each of the two base distributions; and (2) each of the 1551 data sets from a given distribution and the corresponding base distribution. The significance of the outcome was determined by a Monte Carlo method [as recommended by Press et al. (1992), because the alternative is to use a somewhat distribution-dependent formula]: for each distribution, we determined the number k of the N = 1551 simulations that had a D greater than or equal to D_exp, the value of D found for the experimental data tested against the same distribution. One can then compute (see ) that, for the given probability distribution, the probability of finding D ≥ D_exp is given by P(D ≥ D_exp | k, N) = (k + 1)/(N + 2). We thus state that the probability of the hypothesized distribution given the experimental data is ≤(k + 1)/(N + 2). We found k = 41 for the subregion correspondence distribution and k = 0 for the unconstrained hybrid distribution. The resulting probabilities (0.0270 for subregion correspondence and 0.00064 for unconstrained hybrid) agreed reasonably with the probabilities that emerge from the empirical but distribution-dependent formula of Press et al. (1992) that is based simply on D_exp (0.0245 for subregion correspondence and 0.0019 for unconstrained hybrid). Data points in Figure 8 f were determined from the original figure by hand (using the Matlab function “ginput”).
For the tests of the significance of correlation under subregion correspondence in the same section of the paper, all 50,000 data points were used, yielding 1724 data sets of 29 points each. Correlation coefficient and its significance were computed using the routine “pearsn” from Press et al. (1992, page 638).
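The bookkeeping of the Monte Carlo procedure above can be sketched as follows. The two-dimensional Kolmogorov–Smirnov statistic itself (routine “ks2d2s”) is not reimplemented here; the sketch assumes that the values of D have already been computed and are passed in. Function names are hypothetical.

```python
import numpy as np

def split_into_sets(points, size=29):
    """Partition points (rows) into as many disjoint sets of `size` as fit."""
    points = np.asarray(points)
    n_sets = len(points) // size
    return points[: n_sets * size].reshape(n_sets, size, *points.shape[1:])

def monte_carlo_bound(d_exp, d_sim):
    """Return (k, N, (k + 1) / (N + 2)) for simulated statistics d_sim.

    k counts the simulated data sets whose statistic is at least as
    large as the experimental value d_exp.
    """
    d_sim = np.asarray(d_sim, dtype=float)
    k = int(np.count_nonzero(d_sim >= d_exp))
    n = d_sim.size
    return k, n, (k + 1) / (n + 2)
```

For example, 45,000 points yield 45,000 // 29 = 1551 sets, and k = 41 with N = 1551 gives 42/1553 ≈ 0.0270, matching the value quoted for subregion correspondence; k = 0 gives 1/1553 ≈ 0.00064.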
In this article, we propose that the left- and right-eye RFs of binocular simple cells are related by “subregion correspondence.” By this we mean that, where the two eyes’ RFs overlap, ON subregions in one eye will overlap only with ON subregions in the other eye, and similarly for OFF subregions (Fig. 1 c). (More precisely, this will be true when each RF is expressed in a coordinate system in which physiologically corresponding points on the two retinae coincide; see Materials and Methods). If these RFs can be described by Gabor functions, as in Equations 1 and 2, then this hypothesis requires that the sinusoidal portions of these functions for a given cell i be equivalent: Equation 7 Here, Δx_i is the position shift between left- and right-eye RFs along the axis perpendicular to the preferred orientation, and f_i is the cell’s preferred spatial frequency. The phases in the two eyes, φ_Li and φ_Ri, relative to their RF centers may be different, but when Equation 7 is satisfied, we say that the left- and right-eye RFs have the same absolute spatial phase (Fig. 4). The difference in relative phases, Δφ_i = φ_Ri − φ_Li, is referred to as the cell’s phase shift.
Equation 7 requires that the position shift and the phase shift of any cell i must obey: Equation 8 for some integer n. Thus our model can, in principle, be directly tested simply by plotting Δφ_i against f_iΔx_i for a set of measured RFs. However, such a direct test is only possible if the absolute position shifts, Δx_i, can be determined; this in turn requires transformation from the coordinates of RFs measured experimentally to coordinates in which physiologically corresponding points in the two eyes are aligned (see Fig. 2). Because determining this transformation remains very technically challenging, we first examine several indirect tests that have been performed or that can be performed more easily.
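The direct test described above can be sketched numerically. The sketch assumes that the constraint takes the form Δφ = 2πfΔx + 2πn, which is what equality of the two sinusoidal terms implies when each phase is measured relative to its own RF center. It is an illustration, not the analysis performed in this article; the function names, and the resultant-length summary in particular, are introduced here.

```python
import numpy as np

def eq8_residual(dphi, f, dx):
    """Deviation of the phase shift from 2*pi*f*dx, wrapped to [-pi, pi)."""
    r = (np.asarray(dphi, dtype=float)
         - 2 * np.pi * np.asarray(f, dtype=float) * np.asarray(dx, dtype=float))
    return (r + np.pi) % (2 * np.pi) - np.pi

def resultant_length(residual):
    """Mean resultant length of angles: near 1 if clustered, near 0 if uniform."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(residual)))))
```

Under subregion correspondence the residuals should cluster near zero for every integer n, whereas independently distributed position and phase shifts give residuals spread around the circle. Both outcomes presuppose that Δx has been expressed in coordinates in which physiological CRPs coincide.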
In these tests, we compare the predictions of this model against the “unconstrained hybrid” model, which proposes that position and phase shifts both exist but are independently distributed. Note that each of these models applies only to binocular cells for which the responses to stimulation can be described as the thresholded sum of the two eyes’ input, where each eye’s input is the product of a Gabor function RF and the visual stimulus (Equations 1, 2, 4-6). Neither model addresses the binocular RF relationships or disparity tuning properties of other types of cells, such as those for which the input from one eye is primarily inhibitory.
We show that most experimental data gathered so far are equally consistent with either model. However, the observed distribution of peaks of disparity tuning among binocular simple cells is consistent only with the subregion correspondence model. Additionally, we examine a trend observed in the distribution of phase shifts versus preferred orientation. Although either model is consistent with this trend, we show that only subregion correspondence allows a developmental explanation for the origin of this trend that does not require an explicit dependence of RF phases or position shifts on preferred orientation. Because all of this evidence is indirect, we then return to the question of how the difficulties involved in a direct test of Equation 8 might be overcome and argue that this can be achieved by measuring groups of three or more binocular RFs simultaneously.
Disparity tuning curves
Tuned and nontuned cells
Experimentally observed cells with disparity-modulated responses to binocular stimuli can be grouped into tuned and nontuned categories. Tuned cells have response curves with narrow inhibitory and excitatory regions; they include the so-called “tuned excitatory,” “tuned near,” “tuned far,” and “tuned inhibitory” cells. (Examples of these types are shown below.) Nontuned cells are those that do not meet this description; they include traditional “near” and “far” cells and any other cells that have broad inhibitory regions in their response curves. Here, broad and narrow should be considered relative to the typical size of an ON or OFF subregion in RFs of simple cells near the recording location.
Tuned cells tend to receive different types of input than nontuned cells in cat (Fischer and Krüger, 1979; Ferster, 1981; LeVay and Voigt, 1988; Lepore et al., 1992) and also in macaque (Poggio and Fischer, 1977; Poggio et al., 1988). Tuned cells, most of which are of the tuned excitatory type in cat, tend to be binocular. Many have simple-cell RFs that can be described by Equations 1 and 2. Nontuned cells tend to be monocularly driven; they respond weakly or not at all to the nondominant eye alone but show modulated response to that eye when the dominant eye is also stimulated. The input from the nondominant eye is usually inhibitory across its full RF, which does not show simple-cell organization. Equations 1 and 2 cannot describe such an RF, because any inhibitory response to ON (or OFF) stimuli must be balanced by an excitatory response to OFF (or ON) stimuli, and because the ON and OFF regions in each RF must be of the same width.
Response curves matching those of both tuned and nontuned cells can be generated by Equations 4-6 if the appropriate right and left eye RFs are used. However, the models we are concerned with, the subregion correspondence and unconstrained hybrid models, as well as pure position-based and phase-based models, all describe RFs by Equations 1 and 2 and thus only describe tuned binocular cells.
Individual tuning curves
We begin by examining the predicted disparity tuning curves of several model binocular cells obeying subregion correspondence, to demonstrate their possible shapes and the limitations on the placements of their peaks relative to zero disparity. Tuned response curves with a wide variety of shapes can be produced just by varying the relative phases of the left- and right-eye RFs, even if no position shifts are included (Freeman and Ohzawa, 1990; Nomura et al., 1990). Yet position shifts and phase shifts both occur in binocular simple cells (Anzai et al., 1997). From Equation 6, position shifts affect only the placements of the tuning curves along the disparity axis but do not change their shapes. The subregion correspondence model differs from the unconstrained hybrid model only in that it requires a specific phase shift to accompany each position shift. Thus both models make identical predictions about the possible shapes of tuning curves of binocular cells and are only distinguishable based on their predicted distributions of preferred disparities. We will show that only the subregion correspondence model can well explain the distributions observed experimentally in cats.
First, consider a binocular cell with approximately four subregions in each eye’s RF. If there is no position shift, or only a small position shift, the phase shift required by Equation 8 will always result in a disparity tuning curve with a peak at D = 0. The example shown in Figure 5 a has two additional peaks. One or more of these side peaks occurs whenever at least one monocular RF has two or more excitatory subregions and the cell’s firing threshold is not too high. Side peaks can be located only at integral multiples of the preferred stimulus wavelength, 1/f_i, of cell i.
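A toy calculation can make the zero-disparity peak and its side peaks concrete. Equations 1, 2, and 4-6 are not reproduced in this section, so everything below is an assumption made for illustration: a one-dimensional Gabor profile (Gaussian envelope times cosine), a thin bar shown at x in the left eye and x + D in the right eye, a response equal to the thresholded sum of the two eyes' inputs, and a tuning curve summarized as the best response over bar positions. All parameter values and names are hypothetical.

```python
import math

def gabor(x, center, sigma, f, phase):
    """Assumed 1-D Gabor RF profile: Gaussian envelope times cosine carrier."""
    u = x - center
    envelope = math.exp(-u * u / (2.0 * sigma * sigma))
    return envelope * math.cos(2.0 * math.pi * f * u + phase)

def disparity_tuning(f, sigma, x_left, x_right, phase_left, phase_right,
                     threshold, disparities, positions):
    """Best thresholded response over bar positions, for each disparity D."""
    curve = []
    for d in disparities:
        best = 0.0
        for x in positions:
            drive = (gabor(x, x_left, sigma, f, phase_left)
                     + gabor(x + d, x_right, sigma, f, phase_right))
            best = max(best, drive - threshold)  # responses below threshold count as 0
        curve.append(best)
    return curve

f = 1.0                          # cycles/deg (illustrative)
sigma = 0.5                      # envelope SD in deg (illustrative)
x_left, x_right = -0.05, 0.05    # small position shift of 0.1 deg
phase_left = -0.1 * math.pi
# Phase shift tied to the position shift (assumed form of Equation 8).
phase_right = phase_left + 2.0 * math.pi * f * (x_right - x_left)

disparities = [k * 0.05 for k in range(-30, 31)]   # -1.5 to 1.5 deg
positions = [k * 0.01 for k in range(-200, 201)]   # -2 to 2 deg
curve = disparity_tuning(f, sigma, x_left, x_right, phase_left, phase_right,
                         0.2, disparities, positions)
peak_disparity = disparities[curve.index(max(curve))]
```

In this sketch the largest response falls at or very near D = 0, and the response at D = 1/f exceeds the response at D = 0.5/f, giving a side peak one wavelength away.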
A larger position shift can produce a cell where the response is largest at one of the nonzero peaks (Fig. 5 b), similar to some tuned near cells seen in monkey cortex (Poggio et al., 1988). The required position shift is not always large. For example, the position shift in Figure 5 b is only 0.4/f_i.
Although cells with multiple RF subregions can have disparity tuning peaks only at D = 0 and at integer multiples of the wavelength, 1/f_i, the same is not true of all cells. For cells with RF width approximately the same as the width of a single subregion (N ≈ 1), the peak of the tuning curve can be shifted a small distance from these values, as in Figure 5 c.
Tuning curves whose most prominent feature is an inhibitory region can also be produced. The cell in Figure 5 d would be classified as tuned inhibitory because its firing is suppressed near a particular disparity.
The categories of tuned cells exist along a continuum with somewhat blurry boundaries. For example, simply by varying the position shift, the tuning curve of the cell in Figure 5 d could be smoothly deformed from its tuned inhibitory shape into either a tuned excitatory or tuned near form. Lowering the cell’s firing threshold would raise its tuning peaks relative to the baseline, making it even more likely to be classified as tuned excitatory or tuned near based on its excitatory peaks.
Likewise, responses of tuned cells can sometimes resemble those of nontuned cells. The possibility of confusion is greatest for cells with few subregions. In Figure 5, e and f, two tuned cells are shown whose responses resemble nontuned far cells in that they are each inhibited by stimuli with positive (near) but not negative (far) disparities. We know that these cells are tuned, because their binocular RFs are described by Equations 1 and 2. Thus we could classify them both as tuned inhibitory, or else we could emphasize their slight differences by classifying Figure 5 e as tuned excitatory and Figure 5 f as tuned near. But if cells like these were encountered experimentally, they might be classified as far cells because of the predominance of inhibitory responses. We have labeled these cells as “far-like” to emphasize the possible ambiguities in classifying them.
Tuned cells with few subregions in one eye, like Figure 5 d–f, might even appear to be monocular physiologically, especially in those studies that used only bright stimuli (Fischer and Krüger, 1979) rather than both bright and dark stimuli (Ferster, 1981).
In summary, the following characteristics are shared by all models described by Equations 1, 2, and 4-6, with or without subregion correspondence: (1) the tuning curves are of the “tuned” types (tuned excitatory, tuned inhibitory, tuned near, or tuned far), and cannot be of the nontuned types (near or far); (2) tuning curve shapes fall along a continuum, rather than forming distinct classes; (3) cells with multiple RF subregions can have disparity tuning curves with multiple excitatory and/or inhibitory regions; and (4) any excitatory or inhibitory region in a tuning curve can be no wider than the ON and OFF subregions in the cell’s RFs.
Although the shapes of the tuning curves do not differentiate between the models, the models can be distinguished by their predictions as to the locations of peaks and troughs in the tuning curves. The subregion correspondence model places several limitations on these locations: (1) excitatory peaks can only occur near D = 0 and other small integer multiples of 1/f_i (Fig. 5 a,b,d), except that in single-subregion cells, these peaks may be shifted a small amount away from these disparities (Fig. 5 c); (2) the largest excitatory peak can occur far from D = 0 only if there is a large position shift (Fig. 5 b,f); and (3) inhibitory portions of the tuning curve cannot occur at D = 0 or other integral multiples of 1/f_i. We can thus test the subregion correspondence model by examining whether these restrictions on the placements of the tuning curves relative to zero disparity apply to real tuned binocular simple cells.
The distribution of peak disparities
Testing the predictions concerning individual cell disparity tuning curves is complicated by the difficulty of experimentally determining absolute disparity, i.e., of determining the zero disparity point. Tests may be more easily made of the distributions of peak disparities predicted by alternative models, relative to some arbitrary but consistent zero.
Three studies have measured such distributions in the cat. They all reported that most or all binocular cells had “tuned excitatory” response curves with peaks restricted to a very narrow range of disparities. Each of the two earlier studies (Fischer and Krüger, 1979; Ferster, 1981) measured disparities relative to a zero found by aligning the RF envelopes of a cortical reference cell. The zero disparity point was not consistent, because a different reference cell was typically used for each cell tested. Thus, those studies revealed only that the peak disparities in cat were narrowly distributed (that is, differences between peak disparities of the test cell and the reference cell were very small). LeVay and Voigt (1988) showed further that this distribution was centered on zero, where zero was defined consistently for all cells by matching the RF locations measured through each eye at a site on the A–A1 border in LGN. Similarly, Pettigrew and Dreher (1987) report that cells in cat area 19, which receives input from the C layers of the LGN, tend to show tuned excitatory response curves with peaks corresponding to zero disparity as defined by matching positions of monocular RFs across the LGN C1–C2 border.
All models based on Equations 1, 2, and 4-6 can predict the responses only of tuned binocular cells, as described above. Only the subregion correspondence model generates a distribution consistent with the experimental measurements just described, in which most of those tuned cells give their greatest response near zero disparity.
Figure 6 A shows the distributions of peak disparities predicted by three models for a set of tuned binocular cells simulated with parameters chosen to be representative of ocularly balanced cells in the central visual field of cat area 17 (see Table 1, Materials and Methods). In the subregion correspondence model, this distribution is bimodal, with a narrow peak near D = 0, a near absence of cells tuned to 0.25–0.6°, and a small proportion of cells tuned to larger disparities (Fig. 6 A, a). Sixty-eight percent of the tuning curves have peaks within 0.25° of zero; most of these curves fall clearly in the tuned excitatory class, but any cell whose largest excitatory response is near zero is included. The secondary peak in the histogram represents the minority of cells with their largest response nearer to D = ±1/f_i than to zero, such as Figure 5, b and f.
Models not restricted by subregion correspondence produce a unimodal distribution with a single broad peak. In a purely position-based model this peak would have the same width as the distribution of position shifts, but this model is clearly at odds with the demonstrated presence of phase shifts. In a purely phase-based model, which is at odds with experimental demonstrations of position shifts, the width of the peak would depend on the preferred spatial frequencies; lower spatial frequencies give broader distributions of peak disparities. Even for the spatial frequency distribution measured in central area 17, the highest spatial frequency distribution in Table 1, the distribution of tuning curve peaks is broader than the experimental data (Fig. 6 A, b) (only 52% of cells have preferred disparity ≤0.25°). Adding position shifts without also adding the restriction of subregion correspondence produces a still broader distribution (Fig. 6 A, c) (only 33% have preferred disparity ≤0.25°).
The distribution of peak disparities measured by Ferster (1981) in the central visual field of cat area 17 is reproduced in Figure 6 B. Results in area 18 were qualitatively the same, with slightly larger preferred disparities. The central set of unfilled points consists primarily of binocular tuned excitatory cells: 77% of these cells in areas 17 and 18 were binocular. The outer ring of filled points consists mostly of untuned monocular cells; only 17% of these in areas 17 and 18 were binocular. Among these binocular cells some of the simple cells may have been tuned cells, like those in Figure 5, b and f.
The experimental data are well matched to the predictions for binocular tuned cells of the subregion correspondence model (Fig. 6 A, a), and not of the other models, in three respects. First, the tuned excitatory binocular cells are clearly segregated from any other binocular tuned cells by a gap in the distribution of peak disparities. Second, most binocular cells fall into this tuned excitatory class. Third, and most significantly, the distribution of preferred disparities for the tuned excitatory cells is very sharply peaked about zero.
The precise percentage of tuned cells found in the central peak is somewhat dependent on the parameters we used in our simulations for the distributions of position shifts and spatial frequencies. If a narrower (or wider) distribution of position shifts were used, the proportion of cells in the central peak of Figure 6 A, a, would rise (or fall). The value of 68% found for subregion correspondence with the parameters assumed is somewhat smaller than found in the experimental data, which itself has only a small sample size. In areas 17 and 18 together, 11 disparity-sensitive binocular simple cells were measured, of which 9 were in the tuned excitatory class. In total, there were 46 disparity-sensitive binocular cells in areas 17 and 18, including both simple and complex cells, of which 83% were in the tuned excitatory class.
The stronger prediction is that the central peak is very narrow. The experimental central peak has an SD of only 0.15°. The central peak predicted by the subregion correspondence model has an SD of only 0.10°, and this value does not grow larger with increases in the range of position shifts. (Increases in ON and OFF subregion width relative to RF width could increase the width of this peak, because more cells would come to resemble Fig. 5 c.) On the other hand, reducing position shifts in the unconstrained hybrid model all the way to zero, resulting in a purely phase-based model, still leaves a very broad peak (Fig. 6 A, b, SD of 0.41°). This peak can be made more narrow, but only by increasing the spatial frequencies to unreasonably high values; each doubling of all the spatial frequencies would cut the peak width only in half.
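The scaling argument at the end of this paragraph can be checked with simple arithmetic. The sketch below relies only on the stated rule that each doubling of all spatial frequencies halves the peak width of the purely phase-based model; the function names are hypothetical, and the SD values are those quoted above.

```python
import math

def peak_sd_after_doublings(initial_sd_deg, n_doublings):
    """Peak SD (deg) after doubling all spatial frequencies n times,
    assuming the width halves with each doubling."""
    return initial_sd_deg / (2 ** n_doublings)

def doublings_needed(initial_sd_deg, target_sd_deg):
    """Smallest number of frequency doublings bringing the SD to the target or below."""
    return max(0, math.ceil(math.log2(initial_sd_deg / target_sd_deg)))

phase_model_sd = 0.41   # deg, purely phase-based model (Fig. 6 A, b)
experimental_sd = 0.15  # deg, experimental central peak
print(doublings_needed(phase_model_sd, experimental_sd))
print(peak_sd_after_doublings(phase_model_sd, 2))
```

Under this rule, matching the experimental SD of 0.15° would take two doublings, that is, a fourfold increase in every preferred spatial frequency.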
Although the sample of simple cells in the study of Ferster (1981) is small, the binocular complex cells might also be relevant. Their peak disparities seem as narrowly distributed as those of simple cells. This is consistent with the idea that they receive their dominant input from simple cells (Hubel and Wiesel, 1962; Martinez and Alonso, 1998) and largely inherit their disparity tuning from this input. If this were true, then the complex cell as well as simple cell data would provide evidence as to the distribution of peak disparities of binocular simple cells, evidence that is consistent only with the subregion correspondence model.
LeVay and Voigt (1988) reported a broad, unimodal distribution of preferred disparities, also considering both monocular and binocular, simple and complex cells. The data from the binocular cells alone were much more tightly clustered around zero disparity, as we predict, but also did not show obvious signs of bimodality. These differences from the data of Figure 6 B may be attributable to the fact that LeVay and Voigt (1988) combined data from areas 17 and 18 over an unknown range of eccentricities. For a fixed number of subregions, the distribution of preferred disparities should scale with subregion size (i.e., inversely with spatial frequency), which increases with eccentricity and, for a fixed eccentricity, is larger in area 18 than area 17 (Movshon et al., 1978b). Hence, even if the distribution at each eccentricity had the bimodal structure of Figure 6 B, combining data from multiple eccentricities could wash out this structure to yield a unimodal distribution.
We have focused here on the distribution of the tuning peaks of disparity tuning curves. We have not examined the distribution of troughs. Although tuned inhibitory cells have occasionally been reported in cat (Lepore et al., 1992), no data are available on the absolute disparities at which they give their peak inhibition.
Dependence of phase shifts on orientation
The distribution of phase shifts, Δφ_i, observed in simple cells in cat visual cortex appears to be related to those cells’ preferred orientations, θ_i (DeAngelis et al., 1991, 1995). Cells with preferred orientations near horizontal tend to have small phase shifts, whereas vertical-preferring cells show the full range of possible phase shifts (Fig. 7 a). Anzai et al. (1997) observed a similar, but much weaker, relationship. It has been argued that such an anisotropic distribution may be useful in the computation of disparity from simple cell responses (DeAngelis et al., 1991).
The relationship shown in the experimental data of Figure 7 a includes no information about position shifts and thus provides no direct basis for distinguishing subregion correspondence from the other models. Any of the models can “explain” the result by simply assuming it, that is, by assuming that the distribution of phase shifts directly depends on preferred orientation. However, subregion correspondence also allows a simpler explanation, in which there is no direct dependence of the distribution of RF properties on preferred orientation. As we shall show, this explanation leads to a prediction that the anisotropy in Figure 7 a should vary with recording location, depending on the local distribution of preferred spatial frequencies and the local relative distributions of horizontal and vertical position shifts.
Under subregion correspondence, phase shifts, Δφ_i, and position shifts, Δx_i, are linearly related by Equation 8. The relationship of Figure 7 a would then imply that the distribution of Δx_i must be wider for vertical-preferring cells than for horizontal-preferring cells. This could be achieved in a variety of ways. One possibility is that the distribution of position shifts directly depends on preferred orientation; for subregion correspondence, this is equivalent to the assumption that the distribution of phase shifts directly depends on preferred orientations. However, subregion correspondence also allows the following alternative explanation: the data can be accounted for if position shifts in the horizontal direction, ΔH_i, are simply distributed more widely than vertical position shifts, ΔV_i, independent of preferred orientation. This follows from the fact that from Equation 3, Δx_i is measured parallel to the V-axis for horizontal-preferring cells (θ_i = ±π/2) but parallel to the H-axis for vertical-preferring cells (θ_i = 0).
Joshua and Bishop (1970) reported such an anisotropic distribution of position shifts in cat area 17, with a wider distribution of horizontal than vertical position shifts, at eccentricities of 8–12° near the horizontal meridian; whereas in the central 4° of the visual field, they reported an isotropic distribution of position shifts (see Table 1, Materials and Methods). In Figure 7 b, we simulate a population of cells with position shifts drawn from the distribution measured at 8–12°. Preferred orientations and spatial phases in one eye were drawn from a uniform distribution. Preferred spatial frequencies were drawn from a distribution that approximates the measured distribution for the cells in Figure 7 a (see Materials and Methods). The phase shifts were then calculated from Equation 8. The simulated data qualitatively reproduce the experimentally observed trend that the range of phase shifts increases as preferred orientation goes from horizontal to vertical.
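The logic of this simulation can be sketched in a few lines. The version below is a simplified illustration under stated assumptions, not the simulation used for Figure 7 b: Equation 3 is assumed to be the projection Δx = ΔH cos θ + ΔV sin θ (consistent with the description that Δx lies along the H-axis for θ = 0 and along the V-axis for θ = ±π/2), Equation 8 is assumed to be Δφ = 2πfΔx modulo 2π, position shifts are Gaussian with hypothetical SDs rather than the Table 1 values, and a single spatial frequency replaces the measured distribution.

```python
import math
import random

def wrap_phase(phi):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    wrapped = math.fmod(phi + math.pi, 2.0 * math.pi)
    if wrapped <= 0.0:
        wrapped += 2.0 * math.pi
    return wrapped - math.pi

def simulate_phase_shifts(n_cells, sd_h, sd_v, f, seed=0):
    """Return (preferred orientation, phase shift) pairs for simulated cells."""
    rng = random.Random(seed)
    cells = []
    for _ in range(n_cells):
        theta = rng.uniform(-math.pi / 2, math.pi / 2)  # 0 = vertical-preferring
        dh = rng.gauss(0.0, sd_h)   # horizontal position shift (deg)
        dv = rng.gauss(0.0, sd_v)   # vertical position shift (deg)
        dx = dh * math.cos(theta) + dv * math.sin(theta)  # assumed Equation 3
        dphi = wrap_phase(2.0 * math.pi * f * dx)         # assumed Equation 8
        cells.append((theta, dphi))
    return cells

def rms(values):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical parameters: horizontal shifts five times wider than vertical.
cells = simulate_phase_shifts(4000, sd_h=0.5, sd_v=0.1, f=0.5)
near_vertical = [p for t, p in cells if abs(t) < math.pi / 8]
near_horizontal = [p for t, p in cells if abs(t) > 3 * math.pi / 8]
print(rms(near_vertical), rms(near_horizontal))
```

Because the orientation, ΔH, and ΔV are drawn independently, any difference in phase-shift spread between the two orientation groups arises only from the projection, which is the point of the alternative explanation.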
The actual range of position shifts for the cells in Figure 7 a is unknown; so too are the eccentricities, except that they were rarely if ever larger than 15° (R. Freeman, personal communication). For comparison, we show in Figure 7 c–d the results of simulations using data fit to independently measured distributions of position shifts and spatial frequencies from central (0–4°) and more peripheral (8–12°) parts of area 17 (see Table 1, Materials and Methods). Based on these measured distributions, the alternative explanation allowed by subregion correspondence predicts that phase shifts should be evenly distributed as a function of preferred orientation for central locations (Fig. 7 c) and only weakly dependent on preferred orientation at the more peripheral locations (Fig. 7 d).
The stronger anisotropy seen in Figure 7, b versus d, results simply from the lower preferred spatial frequencies used in Figure 7 b. All other parameters, including the distribution of position shifts, were identical. The spatial frequency distribution used in Figure 7 b approximates that actually observed in the data of Figure 7 a, whereas in Figure 7 d, this distribution is taken from independent measurements at 8–12° (Movshon et al., 1978b). Thus, Figure 7 b shows that the lower preferred spatial frequencies (wider subregions) observed in the cells reported in Figure 7 a could be responsible for the strong anisotropy observed. Note that if position shifts and subregion sizes are all scaled by a common factor, the distribution of phase shifts versus preferred orientation is not changed.
In summary, varying degrees of anisotropy in the distribution of phase shifts versus orientation can be created by this mechanism. The precise degree will depend in definite ways on the distributions of position shifts and of spatial frequencies of the measured cells. If these distributions can be measured along with measurements of anisotropy, then simulations as in Figure 7 b–d can be used to test whether this mechanism is operating.
In particular, if we assume that the position shift data of Joshua and Bishop (1970) and the spatial frequency data of Movshon et al. (1978b) are approximately correct, then under this mechanism the relationship of Figure 7 a should not be present in central visual fields of area 17 (Fig. 7 c). Finding a lack of this relationship in central visual fields thus would provide indirect evidence for subregion correspondence, because the other hypotheses have no natural explanation for an eccentricity dependence of this relationship.
However, to directly test this explanation, it would be necessary to measure data such as Figure 7 a while simultaneously measuring the distribution of position shifts. This is equivalent to a direct test of Equation 8, which we now consider.
Joint measurement of position and phase shifts
To directly test the predictions of the subregion correspondence model, one needs to estimate f_i, Δx_i, and Δφ_i for several binocular simple cells and compare their relationship with that predicted by Equation 8. For simple cells, f_i and Δφ_i may be easily measured, for example, by fitting Equations 1 and 2 to the left- and right-eye RFs determined by reverse correlation.
The determination of Δx_i is more difficult, because it requires that we find the necessary rotation and translation operations to bring into alignment physiologically corresponding points measured in the right and left eyes. For any individual cell i, it is always possible to find some such set of operations for which Equation 8 will be true. However, if our hypothesis is correct, there must exist some single choice of rotation and translation operations that, when applied equally to all binocular simple cells with RFs in a small region of visual space, would bring all (or most) cells into agreement with Equation 8.
We illustrate in Figure 8 a–c the expected outcomes of attempts to simultaneously measure position and phase shifts under alternative experimental paradigms, assuming that subregion correspondence holds. Figure 8 a shows the data assuming perfect measurements of position and phase shifts for every cell. The data consist of 100 binocular RFs, each assigned random preferred orientations, spatial frequencies, position shifts, and left-eye phases. For each cell, the right-eye phase was assigned to give subregion correspondence. From Equation 8, the points in the illustrated graph, of Δx versus Δφ/2πf, should lie along the diagonal. Some points are off the diagonal, however, because we have expressed all phase shifts in the range −π ≤ Δφ ≤ π before plotting the data, as is done in experimental measurements; measurements of phase shift are always ambiguous modulo 2π. Cells for which the phase shift given by Equation 8 would fall outside this range give points that do not fall along the diagonal.
In practice, approximations to such an ideal measurement might be made by using an extracellular reference electrode at the border between ocular layers in LGN (Pettigrew and Dreher, 1987; LeVay and Voigt, 1988), where the two eyes’ RFs can be expected to be in correspondence. For each cortical cell studied, the left- and right-eye RFs are simultaneously measured on the reference electrode. The movements needed to align the positions of the two eyes’ RFs at the reference electrode are determined. These same movements are applied to the two eyes’ RFs of the measured cortical cell. Any remaining position shift in the cortical cell’s RF is assigned as the position shift of that cell.
In many experiments, a cortical reference cell is used instead (Ferster, 1981; Anzai et al., 1997). This has the disadvantage that the reference cell is as likely as the measured cell to have a nonzero position shift in its RF, yet the reference cell’s position shift is taken to be zero by this method. Therefore, the reference cell imparts an unknown but constant error to all other position shifts measured from it.
In Figure 8 b, we show how the data of Figure 8 a would look if a single cortical reference cell were used for all measurements. Most points in Figure 8 b lie close to a straight line. The line is displaced from the origin because of the actual but unmeasured position shift of the reference cell. Furthermore, because the orientation of each cell’s x_i-axis depends on its preferred orientation θ_i (see Fig. 2 b), the errors in estimation of the Δx_i values induced by the reference cell’s position shift are of different magnitudes for cells with different preferred orientations, thus producing the scatter in the data.
The single reference cell method would require measuring a set of cells, including the reference cell, at the same time, or during an interval in which eye positions were known to be fixed; or, holding the reference cell for a long period and remeasuring its receptive field with each new measurement to correct for eye movements. Experimentally, it has been difficult to measure multiple cells simultaneously or to hold cells for long periods. Recent experiments have instead measured pairs (or occasionally triples) of cells simultaneously and used one cell in each group as a reference cell for the other(s).
In Figure 8 c, we show how the same simulated data would look if measured in pairs, with one cell serving as a reference cell for the other. Each reference cell adds an independent error. As a result, the linear relationship that is known to exist in these data is greatly obscured when data from all cells are plotted together. The data in Figure 8 c are not completely randomly scattered, however, but rather cluster in the top right and bottom left quadrants. Such clustering is not an artifact of the measurement method, because when the same method is applied to simulated cells given a random relationship between position and phase shift (Fig. 8 d), the clustering does not appear (Fig. 8 e). For subregion correspondence, the linear correlation in the data is in general largest when the position shifts tend to be small compared with the spatial periods of the RFs, because this reduces the number of cases in which a phase shift is large enough for the effect of phase ambiguity to be significant. Thus if the spatial frequencies were not changed, the correlation observed in Figure 8 c would decrease (increase) with a wider (narrower) distribution of position shifts.
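The contrast between the two cases can be sketched with a small Monte Carlo calculation. This is an illustration under explicit assumptions, not the simulation behind Figure 8: Equation 8 is taken to be Δφ = 2πfΔx modulo 2π, each cell's x-axis is taken to be (cos θ, sin θ), position shifts are isotropic Gaussians with a hypothetical SD, a single spatial frequency is used, and the reference cell's own unmeasured two-dimensional shift is subtracted from the test cell's shift before projection. All names and parameter values are hypothetical.

```python
import math
import random

def wrap_phase(phi):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    wrapped = math.fmod(phi + math.pi, 2.0 * math.pi)
    if wrapped <= 0.0:
        wrapped += 2.0 * math.pi
    return wrapped - math.pi

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_pairs(n_pairs, sd_shift, f, correspondence, seed=0):
    """Measured position shift and phase shift / (2*pi*f) for test cells,
    each measured against its own cortical reference cell."""
    rng = random.Random(seed)
    measured_dx, phase_term = [], []
    for _ in range(n_pairs):
        theta = rng.uniform(-math.pi / 2, math.pi / 2)
        c, s = math.cos(theta), math.sin(theta)
        true_dx = rng.gauss(0.0, sd_shift) * c + rng.gauss(0.0, sd_shift) * s
        # Reference cell's unmeasured shift, projected onto the test cell's x-axis.
        ref_error = rng.gauss(0.0, sd_shift) * c + rng.gauss(0.0, sd_shift) * s
        if correspondence:
            dphi = wrap_phase(2.0 * math.pi * f * true_dx)  # assumed Equation 8
        else:
            dphi = rng.uniform(-math.pi, math.pi)           # independent phase shift
        measured_dx.append(true_dx - ref_error)
        phase_term.append(dphi / (2.0 * math.pi * f))
    return measured_dx, phase_term

r_corr = pearson_r(*simulate_pairs(5000, 0.2, 0.5, True))
r_null = pearson_r(*simulate_pairs(5000, 0.2, 0.5, False, seed=1))
print(r_corr, r_null)
```

With these settings the independent reference errors weaken but do not remove the correlation under subregion correspondence, whereas independently drawn phase shifts give a correlation near zero.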
Only one experiment has attempted to measure position and phase shifts simultaneously (Anzai et al., 1997). Because generally RFs of only two cells could be measured simultaneously, the reference cell pair method was used. (On three occasions, three cells were recorded simultaneously; each set of three cells contributed three distinct cell pairs.) Because the resulting data (Fig. 8 f) do not show a significant correlation between position and phase shifts, it was concluded that no relationship exists between the two.
Comparing the experimental data with simulated measurements on groups of cells where we know that a relationship between position and phase shift either did (Fig. 8 c) or did not (Fig. 8 e) exist, the experimental data do not strongly favor either distribution. On the one hand, the simulated data of Figure 8 c, although broadly distributed, show a significant correlation between position and phase shifts, whereas the experimental data of Figure 8 f do not. However, the experimental data set is small, containing only 29 points; of 1724 random draws of 29 points from the subregion correspondence distribution, 37.2 and 23.5% showed no significant correlation at the 0.01 and 0.05 levels, respectively. On the other hand, the experimental data are even less likely to come from the unconstrained hybrid model than from subregion correspondence. A two-dimensional form of the Kolmogorov–Smirnov test (Press et al., 1992; see Materials and Methods) shows p < 0.0270 that the data of Figure 8 f come from the distribution predicted by subregion correspondence (Fig. 8 c) but p < 0.00064 that the data come from the distribution predicted by the unconstrained hybrid model (Fig. 8 e).
The reason for this outcome is probably as follows. The cells with smaller phase and position shifts in the experimental data form a diagonal band similar to the distribution predicted by subregion correspondence; these cells constitute a majority of the data. Furthermore, there is a marked dearth of points in the bottom right quadrant, relative to those expected from the unconstrained hybrid model. These relationships render it improbable that the data were generated by the unconstrained hybrid model. On the other hand, the points with larger phase shifts do not obviously follow the subregion correspondence distribution. In particular, those with large negative phase shifts and small position shifts are very improbable under the distribution predicted by subregion correspondence, and there is a lack of points with larger phase shifts in the top right quadrant relative to the number expected under subregion correspondence. These trends render it improbable that the data were generated by the subregion correspondence model.
These trends in the data might suggest modified hypotheses, which could be tested with further data. For example, we might imagine that subregion correspondence holds only for cells with small phase shifts or for cells with a combination of small phase and small position shifts and is violated by unknown mechanisms for larger shifts (cells with such larger shifts might even show some other systematic absolute phase shift). In sum, although the data of Figure 8 f provide evidence against the hypothesis that all binocular simple cells show subregion correspondence, they also provide evidence against the hypothesis that these cells have uncorrelated phase and position shifts. The data are not obviously inconsistent with the hypothesis that many binocular simple cells show subregion correspondence. More generally, these data may motivate more nuanced hypotheses for further testing.
A stronger test of our hypothesis can be conducted by recording data from multiple cells, either simultaneously or during a period when eye drift artifacts can be eliminated (for example, by use of a single LGN reference electrode, as described above). If any translation can be shown to exist that would allow the data to generate a plot like Figure 8, a or b, that is, a translation that would yield simultaneous subregion correspondence in many or all RFs, this would allow us to reject the null hypothesis that position and phase shifts are independent and thus would constitute strong evidence in favor of our hypothesis.
How many cells would need to be recorded simultaneously? Clearly, measuring RFs of cells singly would be insufficient, since for any pair of left- and right-eye Gabor functions it will always be possible to find a position shift that would bring subregions into correspondence. Likewise, pairs of cells are in general insufficient, because it is generally possible to find a position shift that would bring both cells into correspondence. Specifically, cell 1 may be aligned first, and then cell 2 may be aligned by shifting the two eyes’ RFs relative to one another along the y_1 axis, which allows subregion correspondence to be maintained in cell 1. If the two cells do not have identical preferred orientations, then such movement varies the relative positions of the subregions for cell 2, allowing correspondence to be achieved in both cells. However, the required movement may take the two eyes’ RFs far apart in visual space; if one adds the plausible constraint of a certain minimal degree of overlap of left- and right-eye RFs, then recordings of groups of two cells with similar preferred orientations may be sufficient to test subregion correspondence.
By measuring groups of three or more binocular RFs simultaneously, the hypothesis that all binocular cells show subregion correspondence can be directly tested without such constraints. We have found, in simulations of recordings of groups of three cells, that plots in the form of Figure 8, made after choosing the position shifts for each group to minimize the distance from the diagonal, form distributions that clearly distinguish between subregion correspondence and a random relationship. However, this method can easily fail to distinguish between a distribution in which a subset of binocular cells display subregion correspondence and one in which no cells do so. Thus, more generally, it will be necessary to record from as many cells simultaneously as possible and to test results against simulated data under a given hypothesis (e.g., that a certain percentage or subset of the cells display subregion correspondence) to provide firm tests of such hypotheses.
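The counting argument above (one or two cells can always be aligned, a third cell provides a real test) can be illustrated with a minimal sketch. It assumes, as a stand-in for the linear phase-position relation, that a cell with spatial frequency f and phase shift dphi is brought into correspondence by a displacement dphi / (2*pi*f) along the axis perpendicular to its subregions. The sign convention, the cell parameters, and the function names are hypothetical, and the sketch ignores both the 2*pi periodicity of phase and any interocular rotation.

```python
import math


def required_shift(cell):
    """Displacement (deg) along the cell's x axis that would cancel its phase shift.

    Assumed linear relation: dphi = 2 * pi * f * dx (sign convention is an assumption).
    """
    return cell["dphi"] / (2.0 * math.pi * cell["f"])


def x_axis(cell):
    """Unit vector perpendicular to the subregions (theta measured from horizontal)."""
    t = math.radians(cell["theta"])
    return (math.cos(t), math.sin(t))


def align_two_cells(cell_a, cell_b):
    """Solve n_a . t = d_a and n_b . t = d_b for one translation t = (tx, ty)."""
    ax, ay = x_axis(cell_a)
    bx, by = x_axis(cell_b)
    da, db = required_shift(cell_a), required_shift(cell_b)
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("parallel x axes: no unique translation")
    # Cramer's rule for the 2 x 2 linear system
    return ((da * by - ay * db) / det, (ax * db - da * bx) / det)


def residual_phase(cell, t):
    """Phase mismatch (radians) remaining in `cell` after translation t."""
    nx, ny = x_axis(cell)
    leftover = nx * t[0] + ny * t[1] - required_shift(cell)
    return 2.0 * math.pi * cell["f"] * leftover


# Hypothetical cells: theta in degrees, f in cycles/deg, dphi in radians
cell1 = {"theta": 0.0, "f": 0.5, "dphi": math.pi / 2}
cell2 = {"theta": 90.0, "f": 1.0, "dphi": math.pi}
cell3 = {"theta": 45.0, "f": 0.5, "dphi": 0.0}

translation = align_two_cells(cell1, cell2)
residuals = [residual_phase(c, translation) for c in (cell1, cell2, cell3)]
```

In this toy case the translation that aligns cells 1 and 2 leaves a large phase mismatch in cell 3, so three cells drawn without subregion correspondence will generally not admit a common translation, whereas three cells that obey it will.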
Summary of results and predictions
We have examined the hypothesis that binocular simple cells in cat visual cortex obey subregion correspondence: that within the region of overlap of the two eyes’ receptive fields, the two eyes’ ON subregions lie in corresponding locations and similarly for OFF subregions. This is equivalent to the existence of a specific linear relationship between interocular phase shifts and interocular position shifts (Equation 8). We have compared this with the hypothesis that interocular phase shifts and position shifts are uncorrelated. We evaluated the two hypotheses against a number of pieces of experimental data:
(1) The strongest support for the subregion correspondence hypothesis comes from data showing that most binocular cells in cat areas 17 and 18 have “tuned excitatory” disparity tuning curves (Fischer and Krüger, 1979), with peaks narrowly clustered around 0° (Ferster, 1981; LeVay and Voigt, 1988) and clearly separable from the peaks of other binocular cells (Ferster, 1981). The agreement of these data with the predictions of the subregion correspondence hypothesis is striking. The very narrow clustering of preferred disparities of tuned excitatory cells would not result if interocular phase and position shifts were uncorrelated, not even if the position shifts were negligible. We are not aware of any other hypothesis that is consistent with these results.
(2) Either hypothesis can “explain” the result that the distribution of interocular phase shifts is correlated with preferred orientation (DeAngelis et al., 1991, 1995a; Anzai et al., 1997) by simply assuming the result, i.e., by assuming that the distribution of interocular RF properties varies with preferred orientation. Subregion correspondence also allows an alternative explanation that requires no explicit dependence of RF properties on preferred orientation, but that instead requires an anisotropy of position shifts: horizontal position shifts must have a wider distribution than vertical position shifts.
(3) Attempts to directly measure the relationship between interocular position and phase shifts using a paired reference cell technique (Anzai et al., 1997) produce data that are not obviously consistent with either hypothesis. Because the data set is small, it is difficult to draw firm conclusions. However, we pointed out that the cells with small position and phase shifts, which constitute a majority of the data, show a distribution consistent with subregion correspondence. This could suggest an altered hypothesis, e.g., that subregion correspondence might be restricted to cells with smaller interocular phase shifts or smaller phase and position shifts. Further data are needed to resolve this.
We have described a more direct test of the hypothesis, in which the postulated linear relationship between phase and position shifts can be more directly assessed. This requires measuring binocular RFs of groups of three or more cells simultaneously or during a period when eye movement artifacts can be removed. The test becomes more accurate with larger groups of cells with nearby RFs. The prediction of the subregion correspondence hypothesis is that a single translation/rotation of the coordinates of one eye’s RFs relative to those of the other eye should exist that can simultaneously align the subregions of multiple cells.
The explanation provided by subregion correspondence for the relationship between interocular phase shifts and preferred orientations (point 2, above), along with evidence that the required anisotropy in position shifts exists in mildly peripheral but not central cat area 17 (Nikara et al., 1968; Joshua and Bishop, 1970; von der Heydt et al., 1978), yields the prediction that the orientation–phase relationship should not be seen in central cat area 17. More generally, the prediction is that groups of cells with an isotropic distribution of position shifts should show no such relationship. Confirmation of the prediction would provide indirect support for subregion correspondence, because it could more simply explain the result than the other models. A negative result would unfortunately not distinguish between subregion correspondence and other models but instead would simply argue in favor of the explanation that binocular RF properties depend explicitly on preferred orientation, which works equally for all models considered.
Relationship to developmental models
The subregion correspondence hypothesis arose from attempts to explore the general conditions under which activity-dependent, correlation-based plasticity of geniculocortical inputs yields binocular matching of preferred orientations and spatial frequencies (Erwin and Miller, 1996, 1998). This occurs most simply as a byproduct of optimizing some measure of coactivity among inputs. This in turn can be achieved, given appropriate input activity patterns, by binocularly matching the locations of ON and OFF subregions.
The subregion correspondence hypothesis is, however, independent of any developmental model. If experiments reveal that most binocular simple cells indeed show subregion correspondence, this would strongly support the hypothesis that the two eyes develop matched preferred orientations and spatial frequencies simply as a byproduct of matching of the locations of their ON and OFF subregions. It would not, however, pinpoint the particular underlying plasticity rules used or any coactivity measures that may be optimized. [Indeed, the correlation-based framework (Miller, 1990, 1996) is intended to be as independent of such details as possible.] For example, a model using somewhat different activity-dependent rules also appears to produce subregion correspondence (Shouval et al., 1996), although this was not noted by those authors.
In our developmental model, binocular matching of preferred orientations could also be achieved by subregion anticorrespondence, but all other interocular absolute phase relationships were shown to be excluded. As mentioned in the introductory remarks, these alternatives arise from quite different LGN activity structures and so are not likely to codevelop in the direct projections of LGN cells. Subregion anticorrespondence means that, in overlapping portions of the left- and right-eye RFs, the ON subregions in the right eye would always correspond to OFF subregions in the left eye, and vice versa. Cells with RFs of this type would reverse one previous prediction: their disparity tuning curves could include “tuned near” and “tuned far” curves, as well as “tuned inhibitory” curves with peak inhibition at zero disparity, but could not include a “tuned excitatory” curve with peak at zero. The experimental evidence on the distribution of preferred disparities discussed above renders this scenario unlikely to apply to many binocular cells in the cat.
Our model of development uses a very simple, impoverished model of cortical circuitry, because it focuses primarily on correlations in input structures and how they shape receptive field structure. It is conceivable that development under models with more complex cortical circuitry, e.g., chains or loops of cortical excitation and inhibition, might yield cells with more than one interocular absolute phase relationship, although we are not presently aware of scenarios that achieve this. In addition, our developmental model does not yet address the development of space–time inseparable RFs, which would presumably require inclusion of both lagged and nonlagged LGN inputs (Saul and Humphrey, 1992) (see Wimbauer et al., 1997a,b for attempts to generalize the developmental model in this direction) (also see Feidler et al., 1997). It is conceivable that more than one interocular absolute phase relationship could arise in a developmental model of space–time inseparable RFs (also see discussion of space–time inseparable RFs in Materials and Methods).
However, no matter how complex the model, if binocular matching of preferred orientations is achieved by correlation-based competition among geniculocortical inputs, “It seems inescapable … that the set of absolute spatial phases of left- vs. right-eye RFs in individual layer 4 cells should not be consistent with a random distribution: there should be correlations between the absolute phase found in one eye’s RF and that found in the other eye’s, in order for the preferred orientations of the two eyes to become matched” (Erwin and Miller, 1998). This is the most general, robust prediction that results from our modeling of activity-dependent development. The reason for this conclusion is as follows. Correlation-based competition among geniculocortical inputs leads individual cells to receive a set of geniculocortical inputs that maximize input activity correlations. If all interocular absolute phase relationships are equally likely, all must yield input sets that are equally well correlated. This would mean that interocular correlations cannot distinguish between center types; so rotation of one eye’s preferred orientation and subregions with respect to the other’s (while maintaining the same overall RF envelope) would also yield an equally well correlated receptive field (neglecting RF elongation). Thus, if a random distribution of interocular absolute phase shifts is found, additional elements besides correlation-based development of geniculocortical inputs would appear needed to explain the binocular matching of orientation preferences (see discussion of alternatives by Miller et al., 1999).
Given the developmental motivation, it will be helpful for tests of the subregion correspondence hypothesis to identify those cells that are best described by our developmental model. Thus, it will be helpful to note laminar origins of simple cells studied, in case transformations from first-order simple cells (those receiving strong LGN input) to higher-order simple cells, which we have not studied in our models, might yield alternative binocular RF arrangements. It will also be helpful to distinguish space–time separable versus inseparable cells.
Application to other species and systems
Our developmental model is based primarily on the physiology of the connections from LGN to layer 4 of visual cortex in the cat. The results may also apply to other systems in which there is a feed-forward transformation from a layer of monocular ON- and OFF-center cells to a layer that includes binocular orientation-tuned simple cells. Such a system occurs in the visual Wulst in the barn owl (Pettigrew and Konishi, 1976; Pettigrew, 1979) and may occur in the simple cells of primary visual cortex in other mammals, such as ferret (Chapman and Stryker, 1993) and sheep (Clarke and Whitteridge, 1976; Clarke et al., 1976). Although the subregion correspondence hypothesis might apply anywhere, it makes most sense, in terms of the developmental motivation, to test it in such systems.
Among these species, tests might be easiest in the barn owl, because eye drift and rotation are often negligible (Pettigrew and Konishi, 1976; Wagner and Frost, 1994). Thus, it is tempting to assume that the disparity in tuning curves measured by Wagner and Frost (1994) can be identified as absolute disparity. Then, from the position and phase shifts calculated for one of those cells by Zhu and Qian (1996), it follows that this cell indeed exhibited subregion correspondence.FNb Unfortunately, insufficient data were available to reconstruct the binocular RFs of any additional cells from the same session and thus to test whether other cells also showed such correspondence.
Our developmental model may not be directly applicable to macaque monkeys, in which strongly orientation-selective simple cells constitute only a small minority of LGN-recipient layer 4 cells (Blasdel and Fitzpatrick, 1984; Hawken and Parker, 1984). If many simple cells with segregated ON and OFF subregions exist in monkeys, they are more likely formed by combining inputs from other cortical cells than directly from LGN cells. We thus do not expect subregion correspondence to necessarily hold true in the macaque, although the more general prediction of nonrandom phase relationships may still hold. “Tuned inhibitory” cells with their peak inhibition at zero disparity have been reported only in macaque (Poggio and Fischer, 1977). “Tuned near” and “tuned far” cells appear in macaque visual cortex (Poggio et al., 1988) but have not been reported in cat cortex [although the binocular simple cells among the “near” and “far” cells reported by Ferster (1981) might qualify as “tuned”]. Both tuned excitatory and tuned inhibitory cells in macaque appear to be consistently tuned very near zero disparity (Poggio and Fischer, 1977; Poggio and Talbot, 1981; Poggio et al., 1988). Based on these results, it seems possible that binocular simple cells in macaque may come in two varieties: one group showing subregion correspondence, the other group showing anticorrespondence. Most cells in the first group would be tuned excitatory cells with a preferred disparity of zero (assuming that the distribution of position shifts in monkeys is similar to that in cats, after scaling to preferred spatial frequency). The cells in the second group would produce tuned-near and tuned-far cells tuned to disparities of ±0.5/f_i, where f_i is the cell’s preferred spatial frequency, as well as tuned inhibitory cells with peak inhibitions tightly clustered around zero disparity.
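The contrast between the two proposed groups can be illustrated with a minimal one-dimensional sketch. It assumes that the binocular interaction at disparity d is proportional to the cross-correlation of the left- and right-eye RF profiles, that both RFs are Gabor functions with identical envelopes, and that anticorrespondence is modeled by flipping the sign of one eye's RF. The envelope width, spatial frequency, sampling grids, and function names are all hypothetical choices for illustration.

```python
import math

SIGMA = 1.0  # hypothetical envelope width (deg)
FREQ = 0.5   # hypothetical preferred spatial frequency (cycles/deg)
XS = [i * 0.05 for i in range(-160, 161)]            # RF sample positions (deg)
DISPARITIES = [i / 100.0 for i in range(-200, 201)]  # tested disparities (deg)


def gabor(x, sign=1.0):
    """One-dimensional Gabor RF profile; sign = -1 swaps ON and OFF subregions."""
    envelope = math.exp(-x * x / (2.0 * SIGMA ** 2))
    return sign * envelope * math.cos(2.0 * math.pi * FREQ * x)


def disparity_tuning(sign):
    """Cross-correlation of left(x) with right(x + d) for each disparity d."""
    return [sum(gabor(x) * gabor(x + d, sign) for x in XS) for d in DISPARITIES]


def preferred_disparity(curve):
    """Disparity at which the tuning curve is largest."""
    best = max(range(len(curve)), key=curve.__getitem__)
    return DISPARITIES[best]


corr_curve = disparity_tuning(+1.0)   # subregion correspondence
anti_curve = disparity_tuning(-1.0)   # subregion anticorrespondence
corr_peak = preferred_disparity(corr_curve)
anti_peak = preferred_disparity(anti_curve)
```

Under these assumptions the correspondence curve peaks at zero disparity, while the anticorrespondence curve has its minimum at zero and its maxima close to ±0.5/f (pulled slightly toward zero by the decaying envelope), in line with the tuned-near, tuned-far, and tuned inhibitory profiles described above.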
Developmental implications of the relationship between interocular phase shift and preferred orientation
Figure 7 a shows a set of cells for which the distribution of phase shifts depends on preferred orientation. One explanation, under any of the models of binocular RF relationships considered here, is simply to assume that such a dependence exists. For the subregion correspondence model, this would also imply that the distribution of position shifts shows a similar dependence on preferred orientation; i.e., the distribution of horizontal position shifts of vertical-preferring cells would be wider than the distribution of vertical position shifts of horizontal-preferring cells. The developmental mechanisms that generate phase and, for subregion correspondence, position shifts would thus have to differentiate cortical cells based on orientation preferences. We know of no developmental mechanism that could perform this task without visual input, although one can imagine that visual input might allow such a differentiation.
Only the subregion correspondence model allows the relationship of Figure 7 a to occur with a distribution of position shifts that is independent of preferred orientation. All that is required in this case is that the distribution of horizontal position shifts be wider than that of vertical position shifts, as has been observed in mildly peripheral (5–15° eccentricity) cat area 17 (Barlow et al., 1967; Joshua and Bishop, 1970; von der Heydt et al., 1978).
Such an anisotropy might arise during development if, for example, interocular input correlations were significantly narrower with respect to vertical displacements than horizontal displacements, thus forcing a tighter positional agreement of RFs in the vertical direction to optimize correlations. Such anisotropic correlations could occur without visual input: interocular correlations exist in spontaneous LGN activity before the onset of vision (Weliky and Katz, 1999), and it is quite plausible that such spontaneous activity could show an anisotropy between retinotopically horizontal and vertical directions. This may be important, given that some species are born with disparity-selective simple cells (Ramachandran et al., 1977; Chino et al., 1997). If position shifts can develop or refine because of vision after the eyes are open, such asymmetric correlations might be simply accounted for by the smaller vertical than horizontal relative movements of the two eyes.
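The claim that an orientation-independent but anisotropic distribution of position shifts is enough to make phase shifts depend on preferred orientation can be illustrated with a minimal sketch. It assumes independent, zero-mean horizontal and vertical position shifts, a linear relation dphi = 2*pi*f*dx between phase shift and the position shift measured perpendicular to the preferred orientation, and no phase wrapping; the standard deviations and the function name are hypothetical.

```python
import math


def phase_shift_sd(theta_deg, freq, sd_h, sd_v):
    """Standard deviation (radians) of the interocular phase shift.

    theta_deg: preferred orientation in degrees from vertical.
    sd_h, sd_v: SDs (deg) of horizontal and vertical position shifts,
                assumed independent and identical for all orientations.
    """
    theta = math.radians(theta_deg)
    # Shift perpendicular to the preferred orientation:
    # dx = dH * cos(theta) + dV * sin(theta)
    sd_x = math.sqrt((sd_h * math.cos(theta)) ** 2 + (sd_v * math.sin(theta)) ** 2)
    return 2.0 * math.pi * freq * sd_x


# Hypothetical anisotropy: horizontal shifts three times as broad as vertical ones
vertical_pref = phase_shift_sd(0.0, 0.5, sd_h=0.6, sd_v=0.2)
horizontal_pref = phase_shift_sd(90.0, 0.5, sd_h=0.6, sd_v=0.2)
isotropic = [phase_shift_sd(t, 0.5, sd_h=0.4, sd_v=0.4) for t in (0.0, 45.0, 90.0)]
```

With the anisotropic values, vertical-preferring cells show a broader spread of phase shifts than horizontal-preferring cells; with isotropic shifts the spread is the same at every orientation, which corresponds to the prediction that no orientation-phase relationship should appear where position shifts are isotropic.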
A possible functional benefit of subregion correspondence
Strong disparity-tuned responses can occur in both simple and complex macaque V1 cells even to stimuli that do not produce depth perception (Cumming and Parker, 1997). This could indicate that depth perception is not the only role played by these cells.
Poggio and Fischer (1977) observed that the cells in the layers of monkey V1 known to project to subcortical structures are almost all of the tuned excitatory type. They reasoned that the output of such cells could be useful in maintaining eye positions to stabilize a target on the fovea. Support for the idea that control of these eye movements relies on responses of cells early in the cortical visual pathway is given by the short latencies of the movements elicited in response to self-motion (Busettini et al., 1996) or in tracking an object moving in depth (Masson et al., 1997), along with the fact that all binocular responses in superior colliculus arise from cortical input (Wickelgren and Sterling, 1969).
Subregion correspondence causes the peak response of the population of tuned excitatory cells to be tightly tuned around zero disparity. This population tuning would mean that any stimulus “will either activate almost all of the tuned excitatory cells or almost none of them” (Ferster, 1981). Eye stabilization, keeping an object on the fixation plane at zero disparity, could then be achieved by maximizing the firing of the tuned excitatory cells in the relevant area. Thus, if tuned excitatory cells are used to control eye stabilization movements, subregion correspondence could increase the precision of such control.
We have proposed that simple cells may develop binocularly matched preferred orientations and spatial frequencies by developing a correspondence of the locations of their ON and OFF receptive field subregions. Here, we have shown that two hypotheses, the hypothesis of subregion correspondence and the hypothesis that the positions of subregions in the two eyes are uncorrelated, are equally consistent with much previous experimental data, but that only the subregion correspondence hypothesis seems consistent with the narrow distribution of preferred disparities of binocular cells in cat areas 17 and 18.
Although this provides significant evidence for this proposal, subregion correspondence cannot be confirmed or denied by the indirect evidence that currently exists. Thus we have described experiments that can (and cannot) provide the necessary data to directly test our hypothesis. Results of such tests, when they become available, will be valuable both in understanding how the adult cortex is organized and in constraining developmental models.
DETERMINING PROBABILITIES FROM MONTE CARLO SIMULATIONS
Given a hypothesized distribution, D, we draw N samples of a given size at random and compute some statistic S on each sample. We find that k of the N samples produce S ≥ S_0. Given k and N, we wish to assess the probability of finding S ≥ S_0 for samples of the given size from the given distribution, assuming we have no a priori knowledge of this probability. The answer may be well known, but we are not aware of a reference for it and need to use it (see Materials and Methods); so we present it here.
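We write the desired probability as $P(S \ge S_0 \mid N, k)$. For samples of the given size from D, there is some true probability $p$ between 0 and 1 of finding $S \ge S_0$; so we can write

$$P(S \ge S_0 \mid N, k) = \int_0^1 dp\, P(S \ge S_0 \mid p)\, P(p \mid N, k), \tag{9}$$

where $P(S \ge S_0 \mid p) = p$, by the definition of $p$. To find $P(p \mid N, k)$, use Bayes’ rule to write:

$$P(p \mid N, k) = \frac{P(N, k \mid p)\, P(p)}{\int_0^1 dp'\, P(N, k \mid p')\, P(p')}. \tag{10}$$

Because we have no a priori knowledge of $p$, $P(p) = P(p')$ is constant, independent of $p$ or $p'$; so these terms in the numerator and denominator cancel, leaving:

$$P(p \mid N, k) = \frac{P(N, k \mid p)}{\int_0^1 dp'\, P(N, k \mid p')}. \tag{11}$$

This equation could perhaps have been written down directly; it just says that the probability that the actual probability is $p$ is the proportion, out of all the ways we could get $(N, k)$ for any $p'$, represented by the ways we could get $(N, k)$ with $p$.

The numerator of Equation 11 is given by the binomial distribution: $P(N, k \mid p) = \binom{N}{k} p^k (1 - p)^{N-k}$. Thus, Equation 9 becomes:

$$P(S \ge S_0 \mid N, k) = \frac{\int_0^1 dp\, p^{k+1} (1 - p)^{N-k}}{\int_0^1 dp'\, p'^{\,k} (1 - p')^{N-k}} \tag{12}$$

$$= \frac{B(k + 2,\, N - k + 1)}{B(k + 1,\, N - k + 1)}, \tag{13}$$

where the definition of the beta function, $B(z, w) = \int_0^1 dt\, t^{z-1} (1 - t)^{w-1}$, is used in the last step. Finally, noting $B(z, w) = \Gamma(z)\Gamma(w)/\Gamma(z + w)$, $\Gamma(z + 1) = z\Gamma(z)$ (Abramowitz and Stegun, 1964), this result reduces to:

$$P(S \ge S_0 \mid N, k) = \frac{k + 1}{N + 2}. \tag{14}$$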
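Carrying this derivation through with a uniform prior on p gives the closed form (k + 1)/(N + 2), the classical rule of succession. The following minimal sketch evaluates both the beta-function ratio and the closed form so that the two can be compared numerically. The function names are hypothetical, and the count k = 641 is only an illustrative value, chosen to be roughly 37.2% of the 1724 draws mentioned in the Discussion.

```python
import math


def log_beta(z, w):
    """Natural log of the beta function, B(z, w) = Gamma(z) Gamma(w) / Gamma(z + w)."""
    return math.lgamma(z) + math.lgamma(w) - math.lgamma(z + w)


def probability_from_beta_ratio(k, n):
    """P(S >= S_0 | N, k) as the ratio B(k + 2, N - k + 1) / B(k + 1, N - k + 1)."""
    return math.exp(log_beta(k + 2, n - k + 1) - log_beta(k + 1, n - k + 1))


def probability_closed_form(k, n):
    """The same probability after simplifying with Gamma(z + 1) = z Gamma(z)."""
    return (k + 1) / (n + 2)


# Illustrative values only: k successes out of N Monte Carlo draws
n_draws = 1724
k_hits = 641
p_ratio = probability_from_beta_ratio(k_hits, n_draws)
p_closed = probability_closed_form(k_hits, n_draws)
```

For large N the estimate is close to the raw fraction k/N; the correction matters mainly when k is 0 or N, where the raw fraction would give a probability of exactly 0 or 1.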
This work was supported by National Institutes of Health Grants NS07067 and EY11001-01 and by grants from the Searle Scholars’ Program, the Alfred P. Sloan Foundation, and the Lucille P. Markey Charitable Trust. We gratefully acknowledge useful conversations with M. Stryker, R. Freeman, I. Ohzawa, and F. A. Miles and helpful comments on this manuscript from T. Troyer and A. Kayser.
Correspondence should be addressed to Kenneth Miller, Department of Physiology, University of California, San Francisco, CA 94143-0444.
FNa That this diagonal band is improbable under the uncorrelated model can be seen simply by considering the distribution of points in the four central squares. Fourteen of 19 points fall in the top right or bottom left squares, the two favored by subregion correspondence. The probability of 14 of 19 points randomly falling in these two squares out of four, assuming equal probabilities for the four, is 0.0222.
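The figure quoted in this footnote can be checked with a short binomial calculation. The sketch treats each of the 19 points as falling in one of the two favored squares with probability 0.5 (two squares out of four, equally likely); the function name is hypothetical. The value 0.0222 corresponds to the probability of exactly 14 of 19 points; the probability of 14 or more is also computed for comparison.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


n_points = 19
n_favored = 14
p_favored = 0.5  # two favored squares out of four, equal probabilities assumed

# Probability of exactly 14 of 19 points in the favored squares (about 0.0222)
p_exact = binomial_pmf(n_favored, n_points, p_favored)

# Probability of 14 or more of 19 points in the favored squares (about 0.0318)
p_at_least = sum(binomial_pmf(k, n_points, p_favored)
                 for k in range(n_favored, n_points + 1))
```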
FNb This cell had a preferred orientation θ = ±30° from vertical (sign not specified) and a horizontal component of spatial frequency of 0.5 cycles/deg. Zhu and Qian (1996) determined the phase shift to be Δφ = φR − φL = −π/2 and the position shift to be ΔH = 1.5°, assuming ΔV = 0. This can also be expressed as spatial frequency f = 0.58 cycles/deg with position shift Δx = 1.3°. The experimental data do not constrain Δy and φL. Note that this cell obeys Equation 8.
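The arithmetic in this footnote can be reproduced with a minimal sketch. It assumes that the x axis is perpendicular to the preferred orientation, so that f = f_H / cos θ and Δx = ΔH cos θ when ΔV = 0, and that Equation 8 (not reproduced in this section) takes the form Δφ ≡ 2πfΔx modulo 2π. Both the form of the relation and its sign convention are assumptions made for illustration, as are the variable and function names.

```python
import math


def wrap_phase(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


theta = math.radians(30.0)  # preferred orientation from vertical (sign unspecified)
f_horizontal = 0.5          # horizontal component of spatial frequency (cycles/deg)
delta_h = 1.5               # horizontal position shift (deg), with delta_v = 0

# Spatial frequency and position shift along the axis perpendicular to the orientation
f = f_horizontal / math.cos(theta)   # about 0.58 cycles/deg
delta_x = delta_h * math.cos(theta)  # about 1.3 deg

# Phase shift predicted by the assumed linear relation, wrapped to [-pi, pi)
predicted_phase_shift = wrap_phase(2.0 * math.pi * f * delta_x)
reported_phase_shift = -math.pi / 2.0
```

Because cos θ is an even function, the result does not depend on the unspecified sign of θ. Under these assumptions the predicted phase shift equals the reported value of −π/2.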