Abstract
Weakly electric fish use an electric sense to navigate and capture prey in the dark. Objects in the surroundings of the fish produce distortions in their self-generated electric field; these distortions form a two-dimensional Gaussian-like electric image on the skin surface. To determine the distance of an object, the peak amplitude and width of its electric image must be estimated. These sensory features are encoded by a neuronal population in the early stages of the electrosensory pathway, but are not represented with classic bell-shaped neuronal tuning curves. In contrast, bell-shaped tuning curves do characterize the neuronal responses to the location of the electric image on the body surface, such that parallel two-dimensional maps of this feature are formed. In the case of such two-dimensional maps, theoretical results suggest that the width of neural tuning should have no effect on the accuracy of a population code. Here we show that although the spatial scale of the electrosensory maps does not affect the accuracy of encoding the body surface location of the electric image, maps with narrower tuning are better for estimating image width and those with wider tuning are better for estimating image amplitude. We quantitatively evaluate a two-step algorithm for distance perception involving the sequential estimation of peak amplitude and width of the electric image. This algorithm is best implemented by two neural maps with different tuning widths. These results suggest that multiple maps of sensory features may be specialized with different tuning widths, for encoding additional sensory features that are not explicitly mapped.
In many sensory systems, neurons in the early processing stages are tuned to a specific two-dimensional (2D) location of a stimulus. In the visual system, this corresponds to the 2D projection of the visual world onto the retina; in the somatosensory system, this is the location of a touch on the skin. The neurons in these systems respond maximally for one location, with their activity decreasing for locations away from this preferred location; hence the neural responses are described by 2D bell-shaped tuning curves. Typically, these neurons have preferred locations distributed over a wide space such that a neural map of stimulus location is formed (Konishi, 1986; Knudsen et al., 1987). This is often referred to as a coarse code for stimulus location (Churchland and Sejnowski, 1992). Populations of neurons can also carry information about sensory features to which their component neurons are not explicitly tuned in this manner. In somatosensory processing, the 2D location of a skin probe is coarse-coded by peripheral mechanosensory neurons; yet humans can not only determine the location of the probe but also accurately determine its shape, a feature that is not encoded with bell-shaped tuning curves (Wheat et al., 1995; Khalsa et al., 1998). We refer to such population codes, which involve multiple coding strategies, as combined codes.
Weakly electric fish must use a combined population code during electrosensory processing. These fish can accurately determine the locations of objects in their surroundings using an active electric sense, a behavior called electrolocation (Heiligenberg, 1991; von der Emde et al., 1998). Objects with electrical properties that differ from those of the ambient water produce distortions in the fish's self-generated electric field. On the body surface, these distortions form a 2D electric image (Fig. 1) and provide the sensory input required to accurately encode object location in 3D (Rasnow, 1996; von der Emde et al., 1998). The electric image is initially encoded in the activity of skin electroreceptors. These receptors contact primary afferents that project somatotopically to the hindbrain and terminate in parallel on four maps in the electrosensory lateral line lobe (ELL). Each map is different in size and comprises pyramidal neurons with distinct physiological properties, including tuning curve width (Shumway, 1989a,b; Metzner, 1999; Turner and Maler, 1999). The necessary features of the electric image must be encoded in these 2D arrays of ELL pyramidal neurons. Object location in the 2D body plane is coarse-coded. Object distance (i.e., the third dimension) must be estimated indirectly from population activity related to the width and peak amplitude of the electric image (Rasnow, 1996; Assad et al., 1999).
Figure 1. Computation of object distance in the electrosensory system. a, A schematic of the two-dimensional electric image on the surface of the fish for objects of two different sizes and lateral distances. Although the widths of the images are different, the peak amplitudes are the same (measured in grayscale, with white being the largest). Thus, detecting object distance based only on amplitude leads to ambiguities. b, One-dimensional slices of the electric images caused by conducting spheres of different sizes (ro = 0.5 cm and ro = 1.0 cm) and different lateral distances (z* = 1.0 cm and z* = 1.26 cm). The schematic (top right, fish not to scale) illustrates the combinations of ro and z* (by line type, size, and location) that relate to the graph below.
Theoretical results suggest that the accuracy of a 2D coarse code should be unaffected by the width, and overlap, of the tuning curves (Snippe and Koenderink, 1992; Abbott and Dayan, 1999; Zhang and Sejnowski, 1999). Nonetheless, multiple parallel maps, exhibiting neuronal tuning with different widths and extents of overlap, are universal in sensory systems, even when they do not exist at the sensory periphery (Konishi, 1986). Different maps may provide multiple samples of information that can be averaged by downstream networks for higher accuracy. Alternatively, the different maps may be optimized to encode additional stimulus features using other strategies. The different ELL maps appear to be specialized; in some situations, information from each map is used to produce distinct behaviors (Metzner and Juranek, 1997). Here, we use theoretical analyses and modeling to investigate the influence of ELL pyramidal neuron tuning width in 2D on the accuracy of encoding object location in 3D. In doing so, we suggest that the different ELL maps may be specialized for encoding the different stimulus features used for computing object distance.
MATERIALS AND METHODS
A model description of the electric image. The electric image caused by a spherical conductor is well approximated by a Gaussian-shaped surface with width and peak amplitude given by 2θ and Ao. Using data and simulations from a previous study (Rasnow, 1996), we have developed a parametric model of the electric image that enables our present analysis. We describe the electric image produced by a sphere of radius, ro, at a location (x*, y*, z*) by the function S (Eqs. 1-3):
Equation 1
The half-width of the image (θ) increases linearly with lateral distance (Eq. 2) in the range of available data (ro = [0.125, 0.7]; z* = [1.0, 2.0]; θ, ro, x*, y*, z* have units in centimeters). The peak amplitude of the image, Ao (units in millivolts), decreases as the third power of lateral distance, and although it is actually proportional to the volume of the sphere (Rasnow, 1996), in the range considered, Ao is approximately linear with ro (Eq. 3):
Equation 2
Equation 3
With c1 = (−0.055) and c2 = (0.79), Equations 2 and 3 provide a good description of the data (χ² < 10⁻⁴). This model is not meant to be a detailed reproduction of the electric image, but rather a simple description that allows us to gain insight into the nature of the electrosensory information available for electrolocation. The exact parameter choices do not affect our general conclusions.
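A minimal Python sketch may help make the Gaussian image model concrete. It takes the peak amplitude Ao and half-width θ as direct inputs rather than computing them from ro and z* (Eqs. 2 and 3), and it treats θ as the Gaussian SD, consistent with θ being the half-width at e^(−1/2) of the peak. The function name and the example values are illustrative assumptions, not part of the original model code.

```python
import math

def electric_image(x, y, a_o, theta, x_star=0.0, y_star=0.0):
    """2D Gaussian electric image amplitude (mV) at skin location (x, y) in cm.

    a_o   -- peak amplitude of the image (mV)
    theta -- half-width of the image (cm), used as the Gaussian SD
    (x_star, y_star) -- body-plane location of the object (cm)
    """
    d2 = (x - x_star) ** 2 + (y - y_star) ** 2
    return a_o * math.exp(-d2 / (2.0 * theta ** 2))

# Two hypothetical images with the same peak amplitude but different widths,
# illustrating why peak amplitude alone is ambiguous about distance.
narrow = [electric_image(x, 0.0, a_o=0.3, theta=0.8) for x in (0.0, 0.8, 1.6)]
wide = [electric_image(x, 0.0, a_o=0.3, theta=1.0) for x in (0.0, 1.0, 2.0)]
```

Both lists start at the same peak value and fall to Ao·e^(−1/2) at a distance of one half-width, so only the spatial scale distinguishes the two images.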
A model of the ELL network. To describe the response of the population of ELL pyramidal neurons, we convolve the stimulus, S, with 2D Gaussian-shaped tuning curves of width ς, where (xi, yj) is the tuning curve (or receptive field) center of the neuron labeled ij. Because we have assumed that the electric image is also Gaussian, the convolution, and hence the response of the pyramidal neuron population, is given by Equation 4 (after rescaling to obtain a physiologically appropriate spike count for a 1 sec time window, and accounting for a baseline activity level; go = 100, Ebaseline = 20) (Bastian, 1986b). We include additive noise, Enoise, which has a normal distribution with zero mean and SD η. The response of the neuron ij, Eij, is described in Equation 5:
Equation 4
Equation 5
The network we consider consists of an N × N square grid of pyramidal neurons (i, j = 1, …, N), with their locations on the grid defining their evenly spaced tuning curve centers (xi, yj). Although we allow the grid size N and the grid dimensions (x, y) to vary, we specify the grid spacing (xi+1 − xi = yj+1 − yj = Δ = 0.15 cm) so that the density of tuning curve centers (ρ = 46.7) is in the physiological range of 40–50 neurons/cm² [expressed in relation to body surface area (Shumway, 1989a,b) (J. Lewis and L. Maler, unpublished observations)]. The center of the grid is the origin, (x, y) = (0, 0). Although receptive field sizes of ELL pyramidal neurons have been reported previously (Bastian, 1981; Shumway, 1989a), the methods used (different combinations of object size and direct electrical stimulation) make it difficult to directly obtain values of ς. However, estimates for the physiological range of ς are between ∼0.3 and 0.7 cm depending on the particular ELL map (the centromedial map has the narrowest, and the lateral map has the widest tuning curves).
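The grid and the mean population response can be sketched in a few lines of Python. The exact form of Equation 4 is not reproduced here, so the sketch assumes a Gaussian response profile with half-width √(θ² + ς²) whose peak is rescaled to go·Ao above baseline. That assumption matches the widths and peak activity quoted in the figure legends but may differ in detail from the published equation. All function names are hypothetical.

```python
import math

def grid_centers(n=41, spacing=0.15):
    """Tuning curve centers (cm) of an n x n grid centered on the origin."""
    half = (n - 1) / 2.0
    axis = [(i - half) * spacing for i in range(n)]
    return [(x, y) for x in axis for y in axis]

def mean_response(center, a_o, theta, sigma, x_star=0.0, y_star=0.0,
                  g_o=100.0, e_baseline=20.0):
    """Mean spike count of one neuron (assumed rescaled form of Eq. 4).

    Convolving two Gaussians gives a Gaussian with half-width
    sqrt(theta^2 + sigma^2); the peak is assumed rescaled to g_o * a_o.
    """
    width2 = theta ** 2 + sigma ** 2
    d2 = (center[0] - x_star) ** 2 + (center[1] - y_star) ** 2
    return g_o * a_o * math.exp(-d2 / (2.0 * width2)) + e_baseline

n, spacing = 41, 0.15
centers = grid_centers(n, spacing)
# Neurons per cm^2 over the area spanned by the grid.
density = n ** 2 / ((n - 1) * spacing) ** 2
profile = [mean_response(c, a_o=0.289, theta=1.0, sigma=0.6) for c in centers]
```

With N = 41 and Δ = 0.15 cm, the grid spans 6 cm × 6 cm, and dividing the 1681 neurons by that area gives the quoted density of about 46.7 neurons/cm².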
For simulations of this network, we calculate a neuron response profile using Equation 4 for a given set of object features. Gaussian random numbers (Enoise) with zero mean and SD of η are generated (Press et al., 1993) for each neuron and added to the response profile Eij. These responses are then rounded to the nearest integer value to give the single trial response of the population in terms of spike count. A typical single trial response is shown in Figure 2c. For the open symbols plotted in Figure 3, we estimate the image features from this noisy profile using a least-squares fit to Equation 4 with the free parameters being either ro, x*, y*, z* (Fig. 3a) or θ, Ao, x*, y* (Fig. 3b). This is equivalent to a maximum-likelihood (ML) estimate of the free parameters (Kay, 1993; Deneve et al., 1999). The estimation error over a number of trials is given by the mean-squared difference between the estimated and true values of each parameter (equivalent, in this case, to the variance of the estimated values). For all of the results shown, we use additive noise (Eq. 5) with η = 7 in agreement with preliminary data (J. Bastian, J. Lewis, and L. Maler, unpublished data); however, the exact value of η does not affect our conclusions.
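The simulation and fitting procedure can be illustrated with a simplified Python sketch. It is not the original implementation. It assumes the rescaled Gaussian response profile (peak go·Ao above baseline, half-width √(θ² + ς²)), fixes the object at the grid center so that only θ and Ao are free, and replaces a general nonlinear least-squares routine with a grid search over θ in which the best Ao for each θ is found in closed form. The function names and the search grid are hypothetical.

```python
import math
import random

G_O, E_BASE, ETA, SPACING, N = 100.0, 20.0, 7.0, 0.15, 41

def squared_distances(n=N, spacing=SPACING):
    """Squared distance (cm^2) of each grid neuron from an object at the origin."""
    half = (n - 1) / 2.0
    axis = [(i - half) * spacing for i in range(n)]
    return [x * x + y * y for x in axis for y in axis]

def single_trial(d2, a_o, theta, sigma, rng):
    """Noisy spike counts: assumed mean profile plus Gaussian noise, then rounding."""
    w2 = theta ** 2 + sigma ** 2
    return [round(G_O * a_o * math.exp(-d / (2.0 * w2)) + E_BASE + rng.gauss(0.0, ETA))
            for d in d2]

def least_squares_fit(d2, counts, sigma, thetas):
    """Grid search over theta; for each theta the best a_o is closed-form."""
    best = None
    for theta in thetas:
        w2 = theta ** 2 + sigma ** 2
        basis = [G_O * math.exp(-d / (2.0 * w2)) for d in d2]
        a_o = (sum(b * (c - E_BASE) for b, c in zip(basis, counts))
               / sum(b * b for b in basis))
        sse = sum((c - E_BASE - a_o * b) ** 2 for b, c in zip(basis, counts))
        if best is None or sse < best[0]:
            best = (sse, theta, a_o)
    return best[1], best[2]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
d2 = squared_distances()
counts = single_trial(d2, a_o=0.289, theta=1.0, sigma=0.6, rng=rng)
theta_grid = [0.5 + 0.01 * k for k in range(101)]
theta_hat, a_hat = least_squares_fit(d2, counts, sigma=0.6, thetas=theta_grid)
```

Repeating the trial many times and taking the variance of `theta_hat` and `a_hat` would give the kind of estimation error that is compared with the Cramer-Rao bound.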
Figure 2. A model of the ELL network response. a, The electric image for an object of radius (ro = 0.5 cm) at a location (x*, y*, z*) = (0, 0, 1) is shown on a spatial grid. Image amplitude is in grayscale (white = 0.3 mV; black = 0 mV). b, The 41 × 41 neuronal grid with the tuning curve size of one neuron denoted by the gray shaded circle (ς = 0.6). The position of each neuron on the grid is given by its tuning curve center (xi, yj) in register with the image in a. The neuronal density is ρ = 46.7 neurons/cm². c, A typical realization of the neural response produced by the image in a is shown in grayscale (white = 65 Hz; black = 0 Hz). Other parameter values are: Ebaseline = 20; go = 100; η = 7. d, The broadening of the average neuronal response (plotted vs the tuning curve centers xi, open circles) compared with the electric image (solid line), illustrated in one dimension for the above parameters. The half-widths of the image and response profile are θ and √(θ² + ς²), respectively.
Figure 3. Accuracy of estimating electric image features. a, The error in estimating the size (ro) and (x*, y*, z*) location of a conducting sphere as a function of pyramidal neuron tuning width, ς. b, The error in estimating the corresponding image features Ao and θ as a function of tuning width, ς. In both panels, the continuous lines indicate the analytically computed error given by the minimum variance of the estimate (Cramer-Rao lower bound, see Results). The open squares and open circles [for ro and z* in a, and for Ao and θ in b, respectively] show the errors from network simulations (5000 trials each point; 41 × 41 neuronal grid; ρ = 46.7 neurons/cm²; Ebaseline = 20; go = 100; η = 7; see Results). For the theoretical calculations, a larger grid (101 × 101) was used (with the neuronal density preserved) to avoid edge effects for the larger tuning widths. For the features ro, Ao, and z*, the error is normalized to the true values of the feature. The true values are ro = 0.5, Ao = 0.289, θ = 1.00, and (x*, y*, z*) = (0, 0, 1.2).
We consider two different network implementations of a two-step algorithm for determining θ and Ao: one in which the same network is used to estimate both stimulus features (model 1) and the other consisting of two networks (model 2), with each used to estimate a single feature (see Results; Fig. 5). Our initial comparison involves specific, previously proposed mechanisms (Assad et al., 1999) to implement the algorithm, but to also compare these models in a general decoding framework, we used a variation of the ML method described earlier (see Results). In this case, two networks of the same size (41 × 41 grid) were used. The first network was used to estimate Ao by using a least-squares fit to the noisy neuronal profile with Ao and θ as free parameters. A similar procedure was then performed on the second network but with only θ as a free parameter, with Ao fixed to the value estimated by the first network. This can be viewed as an optimal implementation of the two-step algorithm.
The Cramer-Rao lower bound and Fisher information. Estimation of object size and location in the present context is formulated as the estimation of a vector parameter, ϕ = (ϕ1, ϕ2, ϕ3, ϕ4). In Figure 3a, (ϕ1, ϕ2, ϕ3, ϕ4) corresponds to (ro, x*, y*, z*), whereas in Figure 3b, (ϕ1, ϕ2, ϕ3, ϕ4) equals (θ, Ao, x*, y*). The accuracy of an estimator can be assessed by its bias and variance. An estimator is considered unbiased if its average value is equal to the true value of the estimated parameter. The variance of an unbiased estimator is equivalent to the mean-squared estimation error; the lower the variance the more accurate the estimator. The theoretical lower limit on the variance of any unbiased estimator is given by the Cramer-Rao lower bound (Kay, 1993). The Cramer-Rao bound is the reciprocal of the Fisher information, IF (Eqs. 6, 7). The more accurate an estimator is, the more information it provides about the parameter that is estimated; this information is quantified by IF:
Equation 6
Equation 7
In Equations 6 and 7, ϕest is the estimate and ϕ is the true value of the vector parameter, η is the SD of Enoise, N² is the number of neurons in the population, and k = (1, …, 4) and m = (1, …, 4) index the four parameters. Thus, when four parameters are estimated simultaneously, IF is a 4 × 4 matrix. The Fisher information has previously been used to measure the accuracy of neuronal population codes (Abbott and Dayan, 1999; Deneve et al., 1999; Zhang and Sejnowski, 1999). Assuming (x* = 0, y* = 0), the Fisher information for the parameter θ alone can be rewritten in terms of the grid spacing, Δ (Eq. 8):
Equation 8
In situations in which multiple but similar neuronal populations are involved in estimation (e.g., multiple maps), the Cramer-Rao bound can be calculated from the Cramer-Rao bound for the individual networks. If θ1 and θ2 are estimates from the two different networks, and the combined estimate is θ1–2, then the variance of θ1–2 can be described by Equation 9 (Rosner, 1995):
Equation 9
In the case of two identical networks (i.e., same size, same tuning widths, etc.), taking the average of the two independent estimates is optimal; in this case k1 = k2 = 0.5, and because var(θ1) = var(θ2), the net Cramer-Rao bound is exactly half that for the individual networks. To similarly evaluate the combinations of networks with different properties (as in Fig. 7), a weighted average is best, so we choose the constants k1 and k2 to be the reciprocals of the single network variances [k1 = 1/var(θ1) and k2 = 1/var(θ2)]. A similar procedure was used for Ao estimates as well.
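The combination rule can be checked numerically with a short Python sketch. Because Equation 9 is not reproduced here, the sketch assumes the standard result for a weighted average of two independent estimates, with the weights normalized by their sum so that reciprocal-variance weights can be used directly. The function names and the numerical variances are hypothetical.

```python
def combined_variance(var1, var2, k1, k2):
    """Variance of the weighted average (k1*est1 + k2*est2) / (k1 + k2).

    Assumes the two estimates are independent.
    """
    total = k1 + k2
    return (k1 ** 2 * var1 + k2 ** 2 * var2) / total ** 2

def inverse_variance_weights(var1, var2):
    """Weights chosen as the reciprocals of the single-network variances."""
    return 1.0 / var1, 1.0 / var2

# Identical networks with equal weights: the variance is halved.
same = combined_variance(4.0, 4.0, 0.5, 0.5)

# Different networks (hypothetical variances) with reciprocal-variance weights.
k1, k2 = inverse_variance_weights(1.0, 3.0)
mixed = combined_variance(1.0, 3.0, k1, k2)
```

With reciprocal-variance weights, the combined variance reduces to 1/(1/var1 + 1/var2), which is never larger than the smaller of the two single-network variances.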
RESULTS
Estimating object distance
From a 2D electric image on their body surface, electric fish are able to determine the 3D location (x*, y*, z*) of the object producing the image (von der Emde et al., 1998). The object location in the body plane (x–y plane) can be estimated from the location at which the image has its peak amplitude. However, the peak amplitude of the electric image provides ambiguous information about the third dimension, lateral distance away from the fish (z*). In Figure 1a, two spherical objects of different sizes (and otherwise identical) are located at the same (x*, y*) location, but the larger object is farther away. For this, and many other combinations of object size and lateral distance, the peak amplitude of the image is the same and thus cannot be used to unambiguously determine the lateral distance of each object (see Materials and Methods; Eq. 3) (Rasnow, 1996). The image produced by the larger object is wider than the other (Fig. 1a). This is shown more clearly by a one-dimensional slice through the image (Fig. 1b). When the image is normalized to its peak amplitude, its width can then be used to estimate lateral distance, z* (Rasnow, 1996; Assad et al., 1999).
To enable our analyses, we used a simplified description of the electric image. We assume the electric image has a 2D Gaussian shape, with its peak amplitude and half-width given by the parameters Ao and θ (see Materials and Methods) (Eqs. 1-3). Because θ provides a measure of normalized width and varies linearly with lateral distance, z* (Rasnow, 1996), it then can be used to estimate lateral distance (we use θ and image width interchangeably, although θ is actually the half-width).
Another image feature proposed as an indicator of object distance is the maximum slope of the image normalized to its peak amplitude (von der Emde et al., 1998; von der Emde, 1999). For a Gaussian image, this quantity varies as 1/θ and also fits the published maximum slope data (von der Emde et al., 1998) very well (Lewis and Maler, unpublished observations). Because of the direct relationship between θ, maximum slope, and previously reported data, we have discussed our results in terms of θ alone.
Estimation accuracy and tuning curve width
In the present context, downstream electrosensory networks must extract information about object location given a noisy profile of activity in the ELL pyramidal neuron population. We have formulated a simple model of the ELL population response to a stereotyped electric image (see Materials and Methods). Figure 2a shows the electric image produced by a small sphere (Eq. 1), which provides the input to the 2D grid of model neurons that constitute the ELL network (Fig. 2b). Each neuron on the grid integrates input from the electric image over a restricted range or receptive field (shown schematically by the shaded region in Fig. 2b; Eq. 4), such that for a point stimulus each neuron has a 2D Gaussian-shaped tuning curve (in the x–y plane). The tuning width is given by 2ς (measured at a height corresponding to e^(−1/2) of the tuning curve peak). We ignore any contributions that dynamics may provide, with the response of each neuron given by a spike count over an integration time of 1 sec. After the addition of noise, the ELL population response profile resembles a noisy replication of the electric image (Fig. 2c). Because the electric image is not a point stimulus, the actual response profile of the ELL population is wider than the image, to an extent that depends on the relative values of θ and ς (Fig. 2d) (see next section).
Given the noisy response profile of the ELL population, the typical population decoding problem is to determine the features of the object (x*, y*, z*, ro) that produced the response (Abbott, 1994; Salinas and Abbott, 1994; Deneve et al., 1999; Zhang and Sejnowski, 1999). As discussed before in a functional context, to unambiguously determine z* and ro, two features of the electric image produced by the object must be estimated: the image width and the amplitude of the image peak (θ and Ao, respectively). The accuracy of estimation is limited by the accuracy with which the ELL neurons jointly encode these different features. Using a common approach from statistical estimation theory (Kay, 1993), we can determine an upper limit on this accuracy by computing the Cramer-Rao lower bound for estimating each object feature x*, y*, z*, and ro, as well as the image features θ and Ao (Eq. 6) (see Materials and Methods). Accuracy in this context is given by the mean-squared estimation error, or equivalently, the variance of the estimate. We investigated the influence of two parameters on estimation accuracy: the lateral distance of the object z* and the tuning curve width ς. The error bound for estimating all features increases with z* (data not shown). This is not surprising because the image amplitude (and thus the effective signal-to-noise ratio) decreases fairly quickly with distance (Eq. 3). More interestingly, the effects of changing ς differ between the features (Fig. 3). There is no effect on estimating the x*–y* location (Fig. 3a); the same result has been found previously for point stimuli (Snippe and Koenderink, 1992; Abbott and Dayan, 1999; Zhang and Sejnowski, 1999). On the other hand, increasing tuning width ς results in worse estimation of ro and z* (Fig. 3a). For estimating electric image features (Fig. 3b), increasing ς results in worse estimation of θ (larger error), but better estimation of Ao (smaller error).
Intuitively, this makes sense: wider tuning curves allow more neurons to contribute accurately to the estimation of Ao, and averaging across neurons yields a better estimate. Estimating image width is different, however, because the ELL neurons distort the image through a convolution with their tuning curves (Eq. 4, Fig. 2d). This distortion increases with tuning width, so more neurons that do not accurately represent image width nonetheless influence the θ estimate.
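These opposing trends can be reproduced with a small Python sketch of the Cramer-Rao calculation. It assumes independent additive Gaussian noise of SD η, for which each Fisher information entry is a sum over neurons of products of response derivatives divided by η². It also assumes the rescaled Gaussian response profile (peak go·Ao above baseline, half-width √(θ² + ς²)). Only the 2 × 2 block for (θ, Ao) is computed; with the object at the grid center, the x* and y* terms decouple from this block by symmetry. The function name is hypothetical.

```python
import math

def crb_theta_ao(theta, a_o, sigma, n=101, spacing=0.15, g_o=100.0, eta=7.0):
    """Cramer-Rao bounds (variances) for theta and a_o, object at grid center.

    Uses I_km = (1/eta^2) * sum_ij dF/dphi_k * dF/dphi_m, the Fisher
    information for independent additive Gaussian noise of SD eta.
    """
    w2 = theta ** 2 + sigma ** 2
    half = (n - 1) / 2.0
    axis = [(i - half) * spacing for i in range(n)]
    i_tt = i_aa = i_ta = 0.0
    for x in axis:
        for y in axis:
            d2 = x * x + y * y
            shape = math.exp(-d2 / (2.0 * w2))
            d_a = g_o * shape                                # dF/da_o
            d_t = g_o * a_o * shape * d2 * theta / w2 ** 2   # dF/dtheta
            i_tt += d_t * d_t
            i_aa += d_a * d_a
            i_ta += d_t * d_a
    i_tt, i_aa, i_ta = (v / eta ** 2 for v in (i_tt, i_aa, i_ta))
    det = i_tt * i_aa - i_ta ** 2
    # Diagonal of the inverse 2 x 2 Fisher matrix.
    return i_aa / det, i_tt / det

for sigma in (0.3, 0.5, 0.7):
    var_theta, var_ao = crb_theta_ao(theta=1.0, a_o=0.289, sigma=sigma)
    print(f"sigma={sigma}: CRB(theta)={var_theta:.2e}, CRB(a_o)={var_ao:.2e}")
```

The 101 × 101 grid follows the choice described for the theoretical curves, which limits edge effects at the larger tuning widths.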
Shown also in Figure 3 are the results of network simulations. Using a network grid consisting of a physiological number and density of neurons (41 × 41 neuronal grid, density ρ = 46.7 neurons/cm²), we estimated the image features from the noisy neural responses using an ML approach (see Materials and Methods). The estimation error for this method is very close to the corresponding lower bound (Fig. 3, compare open symbols with solid lines). Note that for larger ς, however, there is a slight deviation from the theoretical bound attributable mainly to edge effects (i.e., the neuronal image has above baseline values beyond the limits of the grid edges).
The relationship between the accuracy of image width estimation and tuning width can be made explicit by expressing the Fisher information for θ, IF(θ), in terms of the grid spacing, Δ, the distance between tuning curve centers (Eq. 8). Differentiating IF(θ) with respect to ς reveals that IF(θ) decreases with ς (i.e., the derivative is negative and thus the estimation error increases) as long as θ² + ς² > (Δ/2)². This condition will hold as long as the tuning curve width is greater than the grid spacing (i.e., if 2ς > Δ). A similar calculation shows that the Fisher information for Ao, IF(Ao), increases with ς for all ς > 0.
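A continuum-limit sketch shows where these trends come from. It is not a reconstruction of Equation 8. It assumes a response profile F(r) = go·Ao·exp(−r²/[2(θ² + ς²)]) + Ebaseline for an object at the origin, additive Gaussian noise of SD η, and a grid fine enough that the sum over neurons can be replaced by an integral with density 1/Δ². Each parameter is treated on its own, ignoring cross terms.

```latex
\begin{align*}
\frac{\partial F}{\partial \theta}
  &= g_o A_o \,\frac{\theta\, r^2}{(\theta^2+\varsigma^2)^2}\,
     \exp\!\left(-\frac{r^2}{2(\theta^2+\varsigma^2)}\right), \\
I_F(\theta)
  &= \frac{1}{\eta^2}\sum_{i,j}\left(\frac{\partial F_{ij}}{\partial \theta}\right)^2
   \approx \frac{1}{\eta^2\Delta^2}\int_0^\infty
     \left(\frac{\partial F}{\partial \theta}\right)^2 2\pi r\,dr
   = \frac{2\pi\,(g_o A_o)^2}{\eta^2\Delta^2}\,
     \frac{\theta^2}{\theta^2+\varsigma^2}, \\
I_F(A_o)
  &\approx \frac{1}{\eta^2\Delta^2}\int_0^\infty g_o^2
     \exp\!\left(-\frac{r^2}{\theta^2+\varsigma^2}\right) 2\pi r\,dr
   = \frac{\pi\, g_o^2}{\eta^2\Delta^2}\,(\theta^2+\varsigma^2).
\end{align*}
```

Under these assumptions IF(θ) falls and IF(Ao) rises as ς increases. The grid-spacing condition stated in the text does not appear because the continuum limit takes Δ to be much smaller than the response width.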
A simple neural algorithm for determining object distance
Estimating θ, in the present context, is equivalent to estimating the half-width of the image at a level of Ao·e^(−1/2). To provide an unambiguous estimate of z*, a measure of image width must be calculated from an image normalized by Ao. So in such a practical situation, peak amplitude Ao must be estimated first, before image width.
One simple algorithm to calculate image width is to first normalize the neural responses to the maximal response and then count the number of neurons that are active above a certain threshold (Assad et al., 1999). One way to formalize this two-step algorithm is to first compute the average activity Eave of all the neurons firing above a threshold, φa (Fig. 4a). This step (step 1) provides an estimate of the peak response in the population, which can be used to normalize all neural activity. Then in step 2, the fraction of neurons (Nw) firing above a different threshold (φw) can be determined (Fig. 4b). These two thresholds are distinct in that φa is fixed and not relative to any neural response, whereas φw comes after the normalization step and is relative to the maximum response in the network. Figure 4, c and d, shows how these measures vary with the features they are supposed to estimate. The actual peak neural activity (goAo) differs from Ao by a constant factor and thus varies with z* in parallel with Ao (Fig. 4c). However, Eave underestimates goAo but still varies linearly with Ao (Fig. 4c, inset). Similarly, Nw varies in a near linear manner with θ (Fig. 4d).
Figure 4. Two-step algorithm for estimating electric image width. a and b show representative response profiles Ei of a one-dimensional slice through the neuronal grid. In a, the peak amplitude of the image is estimated by the average activity Eave of all neurons firing above a threshold level of Ebaseline + φa (Ebaseline = 20). In b, the response profile is normalized by Eave, and the image width is estimated by the number of neurons Nw firing above a threshold level of (Ebaseline/Eave) + φw. c, The average spike count of the neurons above threshold (Eave), as well as the decay in actual peak activity (goAo; go = 100) with increasing object distance z* (φa = 2η = 14; ς = 0.6). Over this range of z*, Eave varies linearly with Ao (inset). d, The fraction of total neurons activated above a threshold level of φw = e^(−1/2) (i.e., the number of neurons with a preferred location within a radius θ of the object location) plotted versus θ. For different values of ς, this measure increases in an almost linear manner with θ. The solid lines are the theoretical curves derived for a continuous distribution of neurons (see Results), and the open symbols show the measure for an actual model network (41 × 41 neuronal grid; ρ = 46.7).
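The two thresholding steps can be sketched in Python as follows. The sketch uses a noise-free response profile of the assumed rescaled Gaussian form (peak go·Ao above baseline, half-width √(θ² + ς²)). It takes Eave to be the average of the raw spike counts of the neurons above Ebaseline + φa; whether the original implementation measured Eave from zero or from baseline is not specified, so this is an assumption. The function names are hypothetical.

```python
import math

def response_profile(theta, a_o, sigma, n=41, spacing=0.15,
                     g_o=100.0, e_baseline=20.0):
    """Noise-free responses on an n x n grid (assumed rescaled Gaussian profile)."""
    w2 = theta ** 2 + sigma ** 2
    half = (n - 1) / 2.0
    axis = [(i - half) * spacing for i in range(n)]
    return [g_o * a_o * math.exp(-(x * x + y * y) / (2.0 * w2)) + e_baseline
            for x in axis for y in axis]

def two_step_width(responses, phi_a, phi_w, e_baseline=20.0):
    """Return (e_ave, n_w) from the two-step threshold algorithm."""
    # Step 1: average activity of neurons above the fixed threshold.
    active = [e for e in responses if e > e_baseline + phi_a]
    if not active:
        return 0.0, 0.0
    e_ave = sum(active) / len(active)
    # Step 2: normalize by e_ave, then count neurons above the relative threshold.
    level = e_baseline / e_ave + phi_w
    n_w = sum(1 for e in responses if e / e_ave > level) / len(responses)
    return e_ave, n_w

eta = 7.0
for theta in (0.8, 1.0, 1.2):
    profile = response_profile(theta=theta, a_o=0.289, sigma=0.6)
    e_ave, n_w = two_step_width(profile, phi_a=2 * eta, phi_w=math.exp(-0.5))
    print(f"theta={theta}: E_ave={e_ave:.1f}, N_w={n_w:.3f}")
```

In this sketch Nw grows with θ at fixed Ao, but its absolute value depends on how Eave is defined, which is one source of the bias in this estimate.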
We consider two specific neural implementations of this algorithm (Fig. 5). Model 1 uses the same map (i.e., network), with tuning width ς1, for estimating both θ and Ao. Model 2 uses two maps, one with a relatively large tuning width (ς2 = 1) for estimating Ao and another with narrower tuning widths (ς1 ≤ 1) for estimating θ. Model 1 is analogous to a single sensory map for all computations, and model 2 is analogous to having two specialized sensory maps, one for estimating peak amplitude Ao, with larger tuning widths, and the other for estimating width θ, with smaller tuning widths. It is critical to note that both models use the same number of neurons in each processing step (each map is a 41 × 41 neuronal grid). The critical difference is that model 2 uses a different tuning width for each processing step.
Figure 5. Schematic representation of two models for implementing the two-step algorithm for estimating electric image width. In model 1 (left), both Ao and θ are estimated using the same map, map 1 with tuning widths ς1. In model 2 (right), separate maps are used for each estimation step: map 2, with wide tuning curves (ς2 = 1), is used to obtain an estimate of Ao, which is then used to normalize the activity in map 1, with narrow tuning curves (ς1 < ς2), from which an estimate of θ is obtained. Because Ao and θ are estimated separately in this two-step algorithm, both models use the same number of neurons to estimate each feature, although model 2 has two maps, and model 1 has only one.
To compare the performance of these models, we compute Nw for many simulated presentations of an object over a range of values of ς1 and z* (φa = 2η; φw = e^(−1/2)). In this situation, the true value of Nw is given by the number of neurons with preferred locations within a circle of radius θ (πθ²ρ). The estimate of Nw from the present neural algorithm, however, is biased. This is in part because of its dependence on Eave and also because it is determined from a neural profile that has an effective width of √(θ² + ς²) caused by the tuning curve convolution (Eq. 4, Fig. 2d). Because in the context of these models, downstream networks would have to use Nw to estimate object distance, and Nw is directly related to θ and object distance (Fig. 4d), we evaluate model performance from the bias and variance in Nw. In all cases tested (ς1 = 0.15–1.0; z* = 1.0–1.4), model 2 outperforms model 1. For z* = 1.2, the estimation variance for both models is shown in Figure 6a. The biases in Nw estimation are nearly identical for both models (data not shown), but the variance for model 2 is substantially less than that for model 1. Because model 2 is better at estimating peak amplitude (by virtue of its wide tuning curves for this step, ς2 = 1), the variance is dominated by the width estimator, and thus increases with ς1 in the same manner as the Cramer-Rao bound for θ (Fig. 3b). Model 1 must use a network with the same ς1 for all steps, so there is a trade-off between accuracy of peak amplitude estimation and accuracy of width estimation. Peak amplitude estimation is better for larger ς1 when more neurons are activated close to peak levels, but width estimation is better for smaller ς1. In the case shown (Fig. 6a), the amplitude estimate dominates even for small ς1 and thus the overall variance decreases with ς1, similar to the Cramer-Rao bound for Ao (Fig. 3b). When ς1 = 1, both models have similar overall accuracy.
Although it would seem that model 2 effectively has twice as many neurons as model 1, as stated earlier, it really uses the same number of neurons as model 1 for each processing step. The slightly better performance of model 2 for ς1 = 1 is caused by the independence of the responses between the different maps. In other words, if the noise in the neuronal responses was exactly correlated between the two maps of model 2, the accuracy would be identical to that of model 1.
Figure 6. Performance of the two models in implementing the two-step algorithm for estimating electric image width. a, This panel shows the variance in the estimate of Nw for each model. Model 1 (open symbols) results in higher variance than model 2 (ς2 = 1; closed symbols) for all tuning widths ς1. Nw is the fraction of neurons above threshold (i.e., the actual number normalized by the total number of neurons, N²). Parameter values are N = 41, φw = e^(−1/2), φa = 2η, ro = 0.5, (x*, y*, z*) = (0, 0, 1.2), ρ = 46.7, Ebaseline = 20, go = 100, η = 7. The true value of Nw = πθ²ρ/N² ∼ 0.09. b, The variance of the θ estimate for a generalized decoding scheme in both models (see Results). All parameter values are the same in a and b, except that in b two independent maps are used for both models (see Results). Note that when ς1 = 1 in this case, both models are identical so the variances are necessarily the same. Each point in a and b represents the variance calculated from 3000 simulated trials.
The general trends shown in Figure 6a are similar for neuronal densities within ∼50–150% of that used in the simulations shown. We also tested several combinations of values for φa (range, 2η–4η) and φw (range, 0.25–0.75) for z* = 1.2 and ς1 = 0.3. Similar trends resulted, so the increased accuracy of model 2 over model 1 does not depend critically on these threshold values. In addition, we also considered conditions in which the noise term Enoise was such that the SD of the ELL neuron responses was equal to their mean response Fij (Eq. 4), rather than constant (η = 7) and independent of Fij. This type of noise produced similar results (data not shown) and does not change our conclusions.
Although the previous analysis demonstrates a clear difference between the two models, it is important to establish that this difference is fundamental and is not simply attributable to the details of the algorithm implementation or the fact that model 2 uses two independent networks. We now consider two independent maps composed of 41 × 41 neuronal grids with tuning widths of ς1 and ς2, respectively. Map 2 provides an estimate of Ao using ML estimation; this estimate is used to normalize the activity in map 1. Then map 1 is used to find the ML estimate of θ. For model 1, both maps have the same tuning width, ς1 = ς2. Model 2 is identical to model 1 except that map 2 has a fixed tuning width, ς2 = 1. This constitutes a test of the two models in a general decoding framework in which the only difference is in the tuning width of map 2. The results are similar to the previous ones, with model 2 providing a better estimate of θ (Fig. 6b). In this case, however, the error increases with ς1 for both models, in the same manner as the Cramer-Rao bound for θ (Fig. 3b), suggesting that θ estimation dominates the overall estimation error. This provides a theoretical validation of our conclusions, but it is certainly not an option for the fish. The ELL does not have multiple maps with the same tuning widths, and thus the fish does not have access to identical information from two identical maps. Our initial analysis (Fig. 6a) shows how the specialized use of an additional map can improve the computation performed by a single map.
The two-step algorithm we have considered is based on previous ideas (Assad et al., 1999) and practical constraints (i.e., peak amplitude must be estimated before normalization can occur). But it is also interesting to ask how maps can be combined in the context of optimal estimation as defined by the Cramer-Rao bound. We again consider the two-map configuration analyzed in Figure 6b. We calculated the Cramer-Rao bounds for θ and Ao for two maps (see Materials and Methods) and compared them to those for a single map (Fig. 7a,b). Two maps with identical tuning widths are twice as good as one map with that tuning width (i.e., the error decreases by half for two maps). The neuronal density is a critical factor in determining population coding accuracy (Zhang and Sejnowski, 1999). Having two identical maps is the same, in terms of accuracy, as having a single map with twice the density, not necessarily twice the number of neurons. Figure 7 also shows that fixing the tuning width of one of the maps (ς2 = 1) is better than having two identical maps for estimating Ao, but worse for estimating θ. Thus, the relative importance of these parameters will influence the optimal configuration of the two maps; if a premium is placed on estimating θ independently of Ao, then narrow tuning in both maps is better. In the two-step algorithm considered in this paper, accurately estimating Ao is critical for the overall accuracy of estimating θ, so a combination of tuning widths is best.
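The factor of two for identical maps follows from a general property of the Cramer-Rao bound: Fisher information adds across independent observations. A brief sketch for a single parameter, using notation introduced here for illustration only (J1 and J2 denote the Fisher information carried by map 1 and map 2), assuming the noise in the two maps is independent:

```latex
J_{\mathrm{total}}(\theta) = J_1(\theta) + J_2(\theta),
\qquad
\operatorname{Var}\bigl(\hat{\theta}\bigr) \;\ge\; \frac{1}{J_1(\theta) + J_2(\theta)} .

% Identical maps: J_1 = J_2 = J, so the bound is halved relative to one map.
\operatorname{Var}\bigl(\hat{\theta}\bigr) \;\ge\; \frac{1}{2J(\theta)}
\;=\; \frac{1}{2}\cdot\frac{1}{J(\theta)} .
```

When θ and Ao are estimated jointly, the same argument applies to the Fisher information matrices, which add across independent maps; for two identical maps the matrix doubles and its inverse, and hence each bound, is halved.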
Performance of two maps in the context of the Cramer-Rao bound. For two different combinations of two maps (similar to those in Fig. 6b), the analytically calculated Cramer-Rao bounds for estimating Ao (a) and θ (b) (see Materials and Methods) are shown by the thick solid and dotted lines, respectively. In one configuration (ς2 = ς1), both maps have the same tuning width, and in the other configuration (ς2 = 1), one map has a tuning width of ς1, and the other is fixed at ς2 = 1. Also shown (thin solid lines) are the analytically calculated Cramer-Rao bounds for a single map (same as those in Fig. 3b).
DISCUSSION
Multiple maps and population coding
Weakly electric fish can accurately electrolocate objects in their surroundings using sensory information contained in a 2D electric image (Bastian, 1987; Heiligenberg, 1991; Nelson and MacIver, 1999; von der Emde, 1999). To unambiguously extract 3D object location, the fish must compute the width and location of the peak of the electric image that is normalized to its peak amplitude (Rasnow, 1996). The electric image is initially encoded in four populations of pyramidal neurons that comprise the four parallel maps in ELL. One map (ampullary system) is specialized for low-frequency signals. The neurons within the three remaining maps (tuberous system) can be distinguished, among other characteristics, by their distinct spatial response properties: the lateral map with large receptive fields, centromedial map with small receptive fields, and the centrolateral map with intermediate-sized receptive fields (Shumway, 1989a,b; Metzner, 1999; Turner and Maler, 1999). Our results suggest a novel function for the parallel sensory maps in ELL, as well as the occurrence of parallel maps in other sensory systems. Namely, in addition to coarse coding stimulus features on different scales, parallel sensory maps may also be optimized to encode features of a sensory stimulus to which the component neurons are not tuned in the same manner. Specifically, in addition to encoding the 2D electric image at different spatial scales, different ELL maps can also be specialized to accurately represent the sensory features required to compute the third dimension, i.e., object distance.
Previous theoretical studies have found that neuronal tuning width (or spatial resolution) should not affect encoding accuracy in 2D (Snippe and Koenderink, 1992; Abbott and Dayan, 1999; Zhang and Sejnowski, 1999). This is also the case for the coarse-coded features of a spatially extended stimulus (i.e., the electric image) (Fig. 3a). This result does not apply when multiple 2D stimuli are given simultaneously, as in two-point discrimination, where narrower tuning curves are better (Snippe and Koenderink, 1992). We show that, depending on the encoding strategy for a particular stimulus feature, either wider or narrower tuning curves improve encoding accuracy. We illustrate the impact of tuning width on the accuracy of determining object distance from the electric image using two simple models. To encode the peak amplitude of the electric image, wider tuning curves in the two coarse-coded dimensions (x and y) result in higher accuracy, whereas to encode image width, narrower tuning curves are more accurate. This suggests that the lateral map in ELL, in addition to its other functions such as processing high-frequency signals like chirps (Shumway, 1989a; Metzner and Juranek, 1997), may provide information about image amplitude that can be used to normalize the activity in the centromedial map, which is then used to compute image width and object distance. Normalization could be mediated by the extensive cerebellar-like feedback that projects to ELL, through shunting inhibition or synaptic depression (Maler and Mugnaini, 1994; Bastian, 1996; Berman and Maler, 1999). This simple hypothesis can be readily tested with established experimental techniques (Bastian, 1987; Metzner and Juranek, 1997; Nelson and MacIver, 1999). For example, ablating the lateral map of ELL (wide tuning) should disrupt the accurate estimation of image amplitude and the subsequent normalization step, resulting in an ambiguous estimation of object distance. Consequently, predictable behavioral errors should occur when animals attempt to distinguish objects with certain combinations of size and distance.
The locus of computation of object distance is not known, and it need not be the centromedial map itself, because information from all ELL maps could be combined in higher brain regions (i.e., torus semicircularis or optic tectum) (Heiligenberg, 1991). Indeed, it is not necessary that there be a locus of computation, or explicit neural map, of object distance in electric fish. Such information could remain in a combined population code throughout its processing stream. However, there is some evidence of neurons in both the tectum and cerebellum that are tuned to object distance (Bastian, 1986a). Similar “distance-tuned” neurons exist in the optic tectum of frogs and toads (House, 1989). There is also evidence that information from the different maps is treated very differently in the torus (Metzner and Juranek, 1997). So apparently the same information from different maps is not simply being averaged. Other constraints could lead to the formation of differently sized maps, such as specialized roles in temporal processing and communication (Metzner, 1999), as well as those proposed in the present paper.
The present analyses have primarily considered the location of an object. However, with an estimate of object distance (z*), the approximate size of the object (ro) can then be decoded from the amplitude estimate (Eq. 3). The accuracy of this estimate will be constrained by the Cramer-Rao bound shown in Figure 3a. Thus, extensive cross-talk between ELL maps (either within ELL or in their projections to higher centers) may be required to identify the complete array of necessary object properties (Assad et al., 1999).
Combined strategies in population coding
The population coding literature has primarily dealt with how neuronal populations encode features to which their component neurons exhibit bell-shaped tuning curves. These studies often focus on how a single value of the feature in question can be extracted from the neuronal population response. There can be more information in the population response than that one value; for example, the entire probability distribution of a stimulus feature can be decoded from the population response (Zemel et al., 1998). There is recent evidence, in the case of visual motion perception, that such information is actually used to form a specific percept (Treue et al., 2000). This information is still related to the coarse-coded stimulus features. To our knowledge, extracting information from a combined population code in a functional context has not been previously considered.
Cues for electrosensory depth perception
Our study of electrosensory depth perception has considered only static cues of object distance. In the context of visual processing, the problem is analogous to judging the depth of a stationary object using only monocular information. Electric images resulting from near objects are narrower and of greater peak amplitude than those of far objects, and thus can be considered as having less blur and higher contrast. Blur and contrast can have a significant influence on visual depth perception and are commonly used by artists in the pictorial depiction of depth (O'Shea et al., 1994; Mather, 1997). In normal visual processing, such cues are usually effective only in the absence of others, such as those resulting from stereopsis and motion. Although there is no binocular analog in electrosensory processing, electric fish certainly have many motion cues available. Indeed, some species of electric fish exhibit a back-and-forth hovering motion that could be used to generate specific cues. Looming cues, such as those resulting from a changing electric image as an object approaches, could also be used for computing a parameter such as the time-to-collision, often discussed in the context of visual looming (Sun and Frost, 1998; Gabbiani et al., 1999; Rind and Simmons, 1999). As yet, little is known about electrosensory motion processing and how electric fish might use such information for electrolocation.
Footnotes
This study was supported by the Canadian Institutes of Health Research through an operating grant to L.M. and a postdoctoral fellowship to J.E.L. Thanks to T. Lewis and S. Kealey for helpful comments on this manuscript.
Correspondence should be addressed to John E. Lewis, Department of Cellular and Molecular Medicine, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Canada K1H 8M5. E-mail: jlewis{at}uottawa.ca.