Abstract
Sensory neurons have been hypothesized to efficiently encode signals from the natural environment subject to resource constraints. The predictions of this efficient coding hypothesis regarding the spatial filtering properties of the visual system have been found consistent with human perception, but they have not been compared directly with neural responses. Here, we analyze the information that retinal ganglion cells transmit to the brain about the spatial information in natural images subject to three resource constraints: the number of retinal ganglion cells, their total response variances, and their total synaptic strengths. We derive a model that optimizes the transmitted information and compare it directly with measurements of complete functional connectivity between cone photoreceptors and the four major types of ganglion cells in the primate retina, obtained at single-cell resolution. We find that the ganglion cell population exhibited 80% efficiency in transmitting spatial information relative to the model. Both the retina and the model exhibited high redundancy (∼30%) among ganglion cells of the same cell type. A novel and unique prediction of efficient coding, the relationships between projection patterns of individual cones to all ganglion cells, was consistent with the observed projection patterns in the retina. These results indicate a high level of efficiency with near-optimal redundancy in visual signaling by the retina.
Introduction
The computations performed by neural circuits are essential for survival but come at a cost. It has been hypothesized that the early stages of sensory processing have evolved to accurately encode environmental signals with the minimal consumption of biological resources (Attneave, 1954; Barlow, 1961; Atick and Redlich, 1990; van Hateren, 1992b; Laughlin, 2001; Chklovskii et al., 2002; Bialek et al., 2006). This theoretical hypothesis, generally known as efficient coding, has been used to explain a variety of observed properties of sensory systems (Laughlin, 1981; Srinivasan et al., 1982; Atick and Redlich, 1992; van Hateren, 1992a; Rieke et al., 1995; Dan et al., 1996; Olshausen and Field, 1996; Baddeley et al., 1997; Bell and Sejnowski, 1997; Machens et al., 2001; Schwartz and Simoncelli, 2001; Vincent and Baddeley, 2003; Chechik et al., 2006; Graham et al., 2006; Smith and Lewicki, 2006; Doi and Lewicki, 2007; Borghuis et al., 2008; Liu et al., 2009).
The retina provides a natural choice for the study of coding efficiency, given its role in transmitting visual information to the brain and the extensive literature documenting its anatomical and functional properties. Previous work showed that behavioral measurements of bandpass contrast sensitivity in the primate visual system (Kelly, 1972; De Valois et al., 1974) are generally consistent with efficient coding (Atick and Redlich, 1992; van Hateren, 1992b; Dan et al., 1996). However, it is still far from clear whether the specific organization of the retinal circuitry—consisting of distinct types of retinal ganglion cells (RGCs) (Masland, 2001), each blanketing the entire visual field with a lattice of irregularly shaped receptive fields (Gauthier et al., 2009)—is consistent with efficient coding. Although the patterns of spike trains observed in individual retinal neurons appear to reflect metabolically efficient information transmission (Balasubramanian and Berry, 2002; Koch et al., 2004), recent studies have shown significant redundancy between pairs of retinal responses (Meister et al., 1995; Puchalla et al., 2005; Schneidman et al., 2006; Shlens et al., 2006; Ala-Laurila et al., 2011), potentially at odds with coding efficiency (Puchalla et al., 2005; Ala-Laurila et al., 2011).
In this paper, we test the efficiency of spatial processing of visual signals that transforms the responses of cone photoreceptors to those of RGCs. Using data from a high-density multielectrode array, we measure the functional connectivity between the lattice of cones and multiple complete populations of RGCs. We then compare these with the proposed connectivity model optimized for transmitting spatial information in natural images subject to the same neural resources found in the measured retinal region. The results indicate high efficiency of the retinal circuitry and an accompanying redundancy of neural signals conveyed to the brain.
Materials and Methods
Physiological data.
We examined electrophysiological recordings obtained from three macaque monkeys based on segments of retina taken from regions at 27, 38, and 28 degrees of eccentricity from the fovea, respectively. Stimulus generation and calibration, spike identification, cell-type classification, and estimation of functional connectivity have been described by Field et al. (2010). Briefly, the connectivities were obtained by reverse correlating the measured spike trains of RGCs against the stimuli, which were spatiotemporal white noise with red, green, and blue monitor primary intensities drawn from a binary distribution. Pixel sizes (with side length ∼1.5 min of arc) were small enough that the spike-triggered average revealed the locations of individual cones as well as their identity: (L)ong, (M)edium, or (S)hort wavelength sensitive. The measurements of functional connectivity were restricted to L and M cones and to ON-Parasol, OFF-Parasol, ON-Midget, and OFF-Midget RGCs, identified by the spatiotemporal properties of their receptive fields. Each type tiled the region of retina examined, in all three datasets. The numbers of cones and RGCs in the three datasets were, respectively, {706,131}, {520,89}, and {569,92}, corresponding to a cell ratio of 5.8 ± 0.4 (mean ± SD). The ratios broken down by RGC type were 71.5 ± 6.7 (ON-Parasol), 70.2 ± 9.6 (OFF-Parasol), 13.7 ± 0.5 (ON-Midget), and 14.1 ± 1.7 (OFF-Midget). The results shown in Figures 2–5 were obtained with the first dataset. Consistent results were obtained with the other two datasets.
RGC response model.
For analysis of coding efficiency, we assume the following functional model of RGC responses:

$$r = W(s + \nu) + \delta, \tag{1}$$

where s is an N-dimensional vector of cone responses, ν (input noise) is Gaussian white noise with variance $\sigma_\nu^2$, W is an M × N matrix expressing the functional connectivity between cones and RGCs, and δ (output noise) is Gaussian white noise with variance $\sigma_\delta^2$. The resulting M-dimensional vector, r, represents the response of the RGCs. The model structure is similar to that of previous studies (Linsker, 1989; Atick and Redlich, 1990; Atick et al., 1990; Bialek et al., 1991; van Hateren, 1992b) but does not assume a regular lattice of cones. The connectivity matrix W is permitted to represent an inhomogeneous RGC population of arbitrary size.
The information transmitted by the RGC population was estimated by assuming a Gaussian probability model for the cone signal s. The empirical covariance of the cone responses, Cs, was computed as follows. First, a set of 62 calibrated achromatic natural images (Doi et al., 2003) was blurred (Fig. 1) according to the modulation transfer function of the human eye (Navarro et al., 1993) at 30, 40, and 30 degrees of eccentricity for the three retina datasets. Next, the retinal images were sampled using the physiologically measured cone mosaic, simulating photon absorption values across the cone lattice. These were transformed with a compressive cone nonlinearity followed by subtraction of the mean across stimuli (Baylor et al., 1987; Doi et al., 2003). For accurate covariance estimation, cone signals were sampled from 6,200,000 randomly selected image patches.
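The covariance estimation described above can be sketched as follows. This is a minimal illustration under loud assumptions: the image is synthetic, the cone positions are random, and `cone_nonlinearity` is a placeholder (a log-type compression), not the cone nonlinearity used in the study. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one blurred retinal image (real calibrated images are not used here).
image = rng.gamma(shape=2.0, scale=1.0, size=(128, 128))

# Hypothetical irregular cone mosaic: N cone positions inside a small patch.
N, patch = 30, 16
cones = rng.integers(0, patch, size=(N, 2))    # (row, col) offsets within a patch

def cone_nonlinearity(absorptions):
    # Placeholder compressive nonlinearity; NOT the function used in the paper.
    return np.log1p(absorptions)

# Sample cone signals from randomly placed patches.
T = 5000
corners = rng.integers(0, image.shape[0] - patch, size=(T, 2))
rows = corners[:, :1] + cones[:, 0]            # shape (T, N)
cols = corners[:, 1:] + cones[:, 1]            # shape (T, N)
signals = cone_nonlinearity(image[rows, cols])

signals -= signals.mean(axis=0)                # subtract the mean across stimuli
C_s = signals.T @ signals / T                  # empirical covariance, N x N
```

With real data, `image` would be replaced by the blurred natural images and `cones` by the measured cone locations.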
The model also includes input noise [capturing the effects of photon shot noise, phototransduction noise, and membrane noise in the cone (Srinivasan et al., 1982; Atick and Redlich, 1990; van Hateren, 1993; Ruderman, 1994)] and output noise [capturing noise introduced after the linear combination of cone responses, including synaptic noise, RGC membrane noise, and the loss of information in the conversion of synaptic currents to spikes (Srinivasan et al., 1982; Atick and Redlich, 1990; van Hateren, 1993; Ruderman, 1994; Dhingra and Smith, 2004)]. The noise variances, $\sigma_\nu^2$ and $\sigma_\delta^2$, were selected to produce signal-to-noise ratios (SNRs) of 1 and 10 (corresponding to 0 and 10 dB), respectively. The input SNR was defined as $\sum_{j=1}^{N} \mathrm{Var}(s_j) / (N\sigma_\nu^2)$, where N is the number of cones and $s_j$ is the jth cone signal, and the output SNR was defined similarly as $\mathrm{tr}(W C_s W^T + \sigma_\nu^2 W W^T) / (M\sigma_\delta^2)$ [note that the numerator is the sum of variances of RGC responses before output noise is added, W(s + ν)]. These choices are not strongly constrained by currently available measurements. However, perturbing these SNR values by ±10 dB produced only minor changes in the results.
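As a hedged illustration of how the two noise variances follow from the SNR definitions, the sketch below solves each definition for its variance and then draws samples from the linear-Gaussian response model. The covariance `C_s` and connectivity `W` are random toy matrices, and the sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 12, 4                       # toy numbers of cones and RGCs

# Toy cone-signal covariance (symmetric positive definite) and connectivity.
A = rng.standard_normal((N, N))
C_s = A @ A.T / N
W = rng.standard_normal((M, N))

snr_in, snr_out = 1.0, 10.0        # 0 dB and 10 dB

# Input SNR = sum_j Var(s_j) / (N * var_nu)  ->  solve for var_nu.
var_nu = np.trace(C_s) / (N * snr_in)

# Output SNR = tr(W C_s W^T + var_nu W W^T) / (M * var_delta)  ->  solve for var_delta.
pre_noise_var = np.trace(W @ C_s @ W.T + var_nu * W @ W.T)
var_delta = pre_noise_var / (M * snr_out)

# Draw samples from r = W (s + nu) + delta.
T = 50_000
s = rng.multivariate_normal(np.zeros(N), C_s, size=T)
nu = np.sqrt(var_nu) * rng.standard_normal((T, N))
delta = np.sqrt(var_delta) * rng.standard_normal((T, M))
r = (s + nu) @ W.T + delta

print(10 * np.log10([snr_in, snr_out]))   # the two SNRs expressed in dB
```

The empirical total variance of `r` should approach `pre_noise_var + M * var_delta` as the number of samples grows.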
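Given the linear-Gaussian RGC response model, the mutual information between cone signal s and the RGC response r can be computed explicitly for any given connectivity matrix W (Atick and Redlich, 1990; Atick et al., 1990; van Hateren, 1992b; Campa et al., 1995):

$$I(s; r) = \frac{1}{2} \log \frac{\det\left(W C_s W^T + \sigma_\nu^2 W W^T + \sigma_\delta^2 \mathbf{I}\right)}{\det\left(\sigma_\nu^2 W W^T + \sigma_\delta^2 \mathbf{I}\right)}, \tag{2}$$

where $\mathbf{I}$ is the identity matrix.
The information present in the cone responses is computed as the mutual information between the cone signal, s, and the noise-corrupted cone responses, s + ν:

$$I(s; s + \nu) = \frac{1}{2} \log \frac{\det\left(C_s + \sigma_\nu^2 \mathbf{I}\right)}{\det\left(\sigma_\nu^2 \mathbf{I}\right)}. \tag{3}$$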
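Both information quantities are log-determinant ratios of Gaussian covariances, so they can be evaluated directly. The following is a minimal sketch, not the authors' code; the function names are hypothetical, and reporting the result in bits (base-2 logarithm) is an assumption made here for readability.

```python
import numpy as np

def transmitted_information(W, C_s, var_nu, var_delta):
    """I(s; r) in bits for r = W(s + nu) + delta with Gaussian s."""
    M = W.shape[0]
    noise_cov = var_nu * W @ W.T + var_delta * np.eye(M)
    total_cov = W @ C_s @ W.T + noise_cov
    # slogdet is numerically safer than log(det(...)) for larger matrices.
    _, logdet_total = np.linalg.slogdet(total_cov)
    _, logdet_noise = np.linalg.slogdet(noise_cov)
    return 0.5 * (logdet_total - logdet_noise) / np.log(2)

def cone_information(C_s, var_nu):
    """I(s; s + nu) in bits: information available in the noisy cone signals."""
    N = C_s.shape[0]
    _, logdet_total = np.linalg.slogdet(C_s + var_nu * np.eye(N))
    return 0.5 * (logdet_total - N * np.log(var_nu)) / np.log(2)

# One-cone example: signal variance 3, input noise variance 1 gives about 1 bit.
print(cone_information(np.array([[3.0]]), 1.0))
```

The ratio `transmitted_information(...) / cone_information(...)` gives the fraction of cone information preserved by a connectivity matrix; by the data-processing inequality it cannot exceed 1.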
Efficient coding solution.
The connectivity matrix W that maximizes the transmitted information (Eq. 2) was derived subject to three constraints. First, the size of W was chosen to match that of the physiological connectivity matrix Wret (i.e., numbers of cones and RGCs). Second, the total response variance,

$$\sum_{i=1}^{M} \mathrm{Var}(r_i) = \mathrm{tr}\left(W C_s W^T + \sigma_\nu^2 W W^T\right) + M\sigma_\delta^2, \tag{4}$$

was constrained to match that of Wret. Third, the total squared synaptic strength,

$$\sum_{i,j} W_{ij}^2 = \mathrm{tr}\left(W W^T\right), \tag{5}$$

was constrained to match that of Wret. Although each of these constraints may be found in previous literature, the present model is the first to include all three. In particular, most previous studies (Atick and Redlich, 1990; Atick et al., 1990; van Hateren, 1992b; Haft and van Hemmen, 1998) assumed only the total response variance constraint without matching cell numbers to physiological data, and Campa et al. (1995) provided the analysis for arbitrary cell numbers but convergent cell ratio (M ≤ N). Our analysis is more general in that the cell ratio may also be divergent (M > N) and in the inclusion of two additional constraints. We have shown that those constraints play an important role in shaping the solution (Doi et al., 2010).
The optimal connectivity is computed by solving a constrained optimization problem for W that maximizes Equation 2 subject to the two equality constraints (Eqs. 4, 5). The conventional procedure of rewriting the objective function using Lagrange multiplier terms for the two equality constraints was adopted (Chong and Zak, 2001). The resulting problem is not easily solved, because the primary objective function (Eq. 2) is not convex with respect to W. The following analysis transforms the problem into a convex one, thus guaranteeing a globally optimal solution for W.
The connectivity matrix can be reexpressed as $W = P\Omega Q^T$, using the singular-value decomposition (Strang, 2005), in which the first and third matrices are orthogonal and the middle one diagonal. First, the first orthogonal matrix, P, does not affect the values of either the objective function (Eq. 2) or the two constraints (Eqs. 4, 5) and thus can be chosen arbitrarily (see below, Best-fitting solution). Second, it can be shown that, for the optimal connectivity, the second orthogonal matrix, Q, should be set to the eigenvector matrix of the signal covariance matrix (Campa et al., 1995). This implies that the signal is first represented in the coordinates of its principal axes, as in principal component analysis (Zhaoping, 2006). Once represented in this coordinate system, the signal is modulated along the axes (via the diagonal matrix Ω) and finally represented with the new basis functions (columns of P) with dimension equal to the number of RGCs. What remains is to optimize the diagonal entries of Ω, denoted as $\omega_i$. Thus, the objective function (Eq. 2) is now reduced to a concave function with respect to the squares of those diagonal entries:

$$I(s; r) = \frac{1}{2} \sum_{i} \log \frac{\omega_i^2(\lambda_i + \sigma_\nu^2) + \sigma_\delta^2}{\omega_i^2 \sigma_\nu^2 + \sigma_\delta^2}, \tag{6}$$

where $\lambda_i$ is the eigenvalue of $C_s$ associated with the ith column of Q and the sum runs over the diagonal entries of Ω.
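One can show that the second derivative of Equation 6 with respect to $\omega_i^2$ is always strictly negative. It is useful to note that both constraints (Eqs. 4, 5) are linear functions of $\omega_i^2$ (for example, the total squared synaptic strength is $\sum_i \omega_i^2$), so the reduced problem is convex.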
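The reduction to the squared diagonal entries can be checked numerically. The sketch below builds W from an arbitrary orthogonal P, arbitrary gains, and eigenvectors of a toy covariance, then compares the full log-determinant expression for the transmitted information with the per-axis sum $\frac{1}{2}\sum_i \log[(\omega_i^2(\lambda_i + \sigma_\nu^2) + \sigma_\delta^2)/(\omega_i^2\sigma_\nu^2 + \sigma_\delta^2)]$ implied by the linear-Gaussian model. The sizes, the noise variances, and the pairing of gains with the leading eigenvectors are assumptions for illustration; information is in nats.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 10, 6
var_nu, var_delta = 0.5, 0.1

# Toy signal covariance and its eigendecomposition (sorted in descending order).
B = rng.standard_normal((N, N))
C_s = B @ B.T / N
lam, Q = np.linalg.eigh(C_s)
lam, Q = lam[::-1], Q[:, ::-1]

# W = P * Omega * Q^T with arbitrary gains and an arbitrary orthogonal P.
K = min(M, N)
omega = rng.uniform(0.2, 1.5, size=K)
P, _ = np.linalg.qr(rng.standard_normal((M, M)))
W = P[:, :K] @ np.diag(omega) @ Q[:, :K].T

# Full expression: log-determinant ratio of response and noise covariances.
noise = var_nu * W @ W.T + var_delta * np.eye(M)
_, ld_total = np.linalg.slogdet(W @ C_s @ W.T + noise)
_, ld_noise = np.linalg.slogdet(noise)
info_full = 0.5 * (ld_total - ld_noise)

# Reduced expression: a sum over principal axes that depends only on omega_i^2.
gain = omega**2
info_reduced = 0.5 * np.sum(
    np.log((gain * (lam[:K] + var_nu) + var_delta) / (gain * var_nu + var_delta))
)

# Both resource terms are linear in omega_i^2.
total_variance = np.trace(W @ C_s @ W.T + var_nu * W @ W.T)
total_strength = np.trace(W @ W.T)
print(np.isclose(info_full, info_reduced))
```

Because the objective is concave and the resource terms are linear in `gain`, a generic convex solver could be applied to `gain` directly; no particular solver is implied by the text.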
It is important to note that the efficient coding solution, Wopt, is not a whitening matrix except for the special case in which the input noise is zero and the resources are solely constrained by the total response variance. Because the input noise of the retina is significant (Ala-Laurila et al., 2011) and the synaptic weights in the retina are naturally assumed to have a direct bearing on the cost of synaptic resource usage, Wopt will never be a whitening matrix for the retinal transform.
Best-fitting solution.
A set of connectivity matrices that are equally optimal is given by PWopt, where P is an orthogonal matrix, and Wopt is an arbitrarily selected optimal connectivity matrix. We obtained Wopt with $P_{rnd}\Omega_{opt}Q_{opt}^T$, where Prnd is a left orthogonal matrix of the singular value decomposition of an M × M matrix with elements randomly drawn from the normal distribution, and Ωopt and Qopt are the optimal components of W as defined in the previous section. The best-fitting orthogonal matrix for the optimal connectivity matrix, Pfit, is given by the minimizer of the squared error, $\varepsilon(P) = \|P W_{opt} - W_{ret}\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm (the square root of the sum of squares of the matrix entries). This type of optimization is known as the orthogonal Procrustes problem and can be solved in closed form (Gower and Dijksterhuis, 2004). We reported the squared error relative to the data variance, $\varepsilon / \|W_{ret}\|_F^2$.
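For the square case, the closed-form Procrustes solution can be sketched with a single singular value decomposition. This is the textbook solution, shown on synthetic matrices; `procrustes_rotation` is a hypothetical helper name, and the toy `W_opt` and `W_ret` are random stand-ins for the optimal and measured connectivity.

```python
import numpy as np

def procrustes_rotation(W_opt, W_ret):
    """Orthogonal P minimizing ||P @ W_opt - W_ret||_F (closed form via SVD)."""
    U, _, Vt = np.linalg.svd(W_ret @ W_opt.T)
    return U @ Vt

rng = np.random.default_rng(2)
M, N = 5, 20
W_opt = rng.standard_normal((M, N))

# Synthetic "data": a rotated copy of W_opt plus a little measurement noise.
P_true, _ = np.linalg.qr(rng.standard_normal((M, M)))
W_ret = P_true @ W_opt + 0.05 * rng.standard_normal((M, N))

P_fit = procrustes_rotation(W_opt, W_ret)

# Squared error relative to the data variance.
err = np.linalg.norm(P_fit @ W_opt - W_ret, "fro") ** 2
rel_err = err / np.linalg.norm(W_ret, "fro") ** 2
print(rel_err)
```

Since `P_fit` is the global minimizer over orthogonal matrices, its error cannot exceed that of any other orthogonal matrix, including `P_true`.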
Each RGC receptive field outlined in Figure 2 is the effective linear weighting that maps visual stimuli to RGC response. This is constructed by convolving the receptive fields of individual cones (depicted by small circles in Fig. 1) with the point spread function of the eye at the relevant eccentricity (Navarro et al., 1993) and then summing all these cone profiles with the weights specified by the connection matrix entries for that RGC.
Boundary handling.
Most of the analyses were conducted without special handling of the boundaries of the recording. Exceptions are as follows. In Figure 2, those cones on the boundary of the retinal patch were excluded. To solve for Pfit, several RGCs on the boundary of the retinal patch were excluded. In this case, Pfit is rectangular with the row vectors orthogonal to each other, and the standard closed-form solution for the orthogonal Procrustes problem cannot be used (Gower and Dijksterhuis, 2004). Thus, Pfit was obtained numerically by alternating gradient descent steps on the squared error with orthogonalization of the rows of Pfit. The analysis was repeated without this boundary handling, producing similar but noisier results.
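One way to realize the iteration for a rectangular Pfit is projected gradient descent: take a gradient step on the squared error, then restore orthonormal rows. The sketch below is an assumption-laden illustration, not the procedure actually used: the projection via the SVD polar factor, the step size, the iteration count, and all names are choices made here.

```python
import numpy as np

def project_rows_orthonormal(P):
    """Nearest matrix with orthonormal rows (polar factor from the SVD)."""
    U, _, Vt = np.linalg.svd(P, full_matrices=False)
    return U @ Vt

def fit_row_orthonormal(W_opt, W_ret_sub, n_iter=300, seed=0):
    """Minimize ||P @ W_opt - W_ret_sub||_F^2 over K x M matrices with orthonormal rows."""
    K, M = W_ret_sub.shape[0], W_opt.shape[0]
    rng = np.random.default_rng(seed)
    P = project_rows_orthonormal(rng.standard_normal((K, M)))
    # Step 1/L, where L = 2 * sigma_max(W_opt)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(W_opt, 2) ** 2)
    errors = []
    for _ in range(n_iter):
        resid = P @ W_opt - W_ret_sub
        errors.append(np.sum(resid**2))
        grad = 2.0 * resid @ W_opt.T
        P = project_rows_orthonormal(P - step * grad)
    errors.append(np.sum((P @ W_opt - W_ret_sub) ** 2))
    return P, np.array(errors)

# Toy example: 6 model RGCs, of which 4 non-boundary cells are fitted.
rng = np.random.default_rng(3)
M, N, K = 6, 25, 4
W_opt = rng.standard_normal((M, N))
P_true = project_rows_orthonormal(rng.standard_normal((K, M)))
W_ret_sub = P_true @ W_opt + 0.05 * rng.standard_normal((K, N))

P_fit, errors = fit_row_orthonormal(W_opt, W_ret_sub)
print(errors[0], errors[-1])
```

With this step size the squared error is non-increasing from one iteration to the next, although the iteration may stop at a local minimum rather than the global one.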
Unique prediction about connectivity.
The family of efficient connectivity matrices Wopt, which differ from each other only by an orthogonal transformation, provides a novel theoretical prediction that can be compared with data: the uniquely specified matrix $Z = W_{opt}^T W_{opt}$, for any given Wopt. This matrix is a unique prediction of efficient coding, because it is determined solely by the two unique optimal components, Ωopt and Qopt, and is invariant to the choice of the orthogonal matrix P [because $(P W_{opt})^T (P W_{opt}) = W_{opt}^T P^T P W_{opt} = W_{opt}^T W_{opt}$, for any orthogonal P]. The individual elements $Z_{ij}$ of Z represent the inner product of the ith and jth columns of the connectivity matrix, which contain the weights of the ith and jth cone projective fields (PFs). If i = j, then $Z_{ij}$ indicates the squared strength (or norm) of the PF of the ith cone.
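The invariance of Z is easy to confirm numerically. In this small sketch the connectivity matrix is a random stand-in and the sizes are arbitrary assumptions; only the algebraic identity is being illustrated.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 6, 15                      # toy numbers of RGCs and cones
W_opt = rng.standard_normal((M, N))

# Any orthogonal P gives another member of the same family of solutions.
P, _ = np.linalg.qr(rng.standard_normal((M, M)))
W_rot = P @ W_opt

Z = W_opt.T @ W_opt               # N x N matrix of projective-field inner products
Z_rot = W_rot.T @ W_rot

pf_sq_norms = np.diag(Z)          # squared PF norm of each cone
print(np.allclose(Z, Z_rot))
```

Receptive fields (rows of `W_opt` and `W_rot`) differ between the two matrices, while every entry of `Z` is unchanged.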
Definition of redundancy.
A general form of redundancy that quantifies the informational overlap in a neural population is the sum of the information transmitted by disjoint subpopulations of neurons (e.g., individual neurons), $r_i$, minus the information transmitted jointly by the population formed from their union, $r = \{r_i;\ i = 1, \ldots, M\}$:

$$\Delta I = \sum_{i=1}^{M} I(s; r_i) - I(s; r). \tag{9}$$

The negative of this quantity is also referred to as synergy (Gawne and Richmond, 1993; Brenner et al., 2000; Machens et al., 2001; Schneidman et al., 2003; Latham and Nirenberg, 2005).
The portion of information that is uniquely conveyed by the kth neuron is given by

$$I(s; r) - I(s; r_{\neg k}),$$

where r is the responses of the full neural population, and $r_{\neg k}$ is the responses of the same population, with the kth neuron removed. The portion of information conveyed by the kth neuron that is also conveyed by all the other neurons in the population is thus given by

$$\Delta I_{sc}(k) = I(s; r_k) - \left[ I(s; r) - I(s; r_{\neg k}) \right],$$

referred to here as the single-cell redundancy. Note that this is a special case of Equation 9 because the union of $r_k$ and $r_{\neg k}$ is r. In Results, the ratio of single-cell redundancy to the information conveyed by the single neuron is reported, $\Delta I_{sc}(k) / I(s; r_k)$.
The pairwise redundancy shown in Figure 4 is also given by Equation 9, with the population consisting of two neurons. The quantity reported in Figure 4 is normalized in accordance with previous work (Puchalla et al., 2005):

$$\frac{I(s; r_i) + I(s; r_j) - I(s; r_i, r_j)}{\min\{ I(s; r_i),\ I(s; r_j) \}},$$

for which the maximum possible value, corresponding to a completely redundant pair, is 1 (Machens et al., 2001).
To gain insights into efficient coding, it is also useful to examine the redundancy of Equation 9 for the full set of individual neurons within a population. The information transmitted by the entire population can be expressed as the sum of information transmitted by individual neurons, minus the redundancy:

$$I(s; r) = \sum_{i=1}^{M} I(s; r_i) - \Delta I. \tag{12}$$

This implies that maximizing information, I(s;r), is a tradeoff between maximizing the sum of transmitted information by individual neurons (the first term on the right side) and minimizing the redundancies between them (the second term). This tradeoff has been discussed previously with a different definition of redundancy (Borghuis et al., 2008; Balasubramanian and Sterling, 2009). Note that Equation 12 makes it explicit that redundancy reduction is not equivalent to information maximization.
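Under the linear-Gaussian response model, each redundancy measure defined above reduces to differences of log-determinants, because the response of any subpopulation is obtained by keeping the corresponding rows of W. The sketch below is illustrative only: the function names are hypothetical, bits are an assumed unit, the normalization by the smaller single-cell information follows the convention attributed to Puchalla et al. (2005), and the toy connectivity is invented.

```python
import numpy as np

def info_bits(W, C_s, var_nu, var_delta):
    """I(s; r) in bits for r = W(s + nu) + delta with Gaussian s."""
    M = W.shape[0]
    noise = var_nu * W @ W.T + var_delta * np.eye(M)
    _, ld_total = np.linalg.slogdet(W @ C_s @ W.T + noise)
    _, ld_noise = np.linalg.slogdet(noise)
    return 0.5 * (ld_total - ld_noise) / np.log(2)

def population_redundancy(W, C_s, var_nu, var_delta):
    """Sum of single-cell information minus the jointly transmitted information."""
    singles = sum(info_bits(W[[i]], C_s, var_nu, var_delta) for i in range(W.shape[0]))
    return singles - info_bits(W, C_s, var_nu, var_delta)

def single_cell_redundancy_ratio(W, k, C_s, var_nu, var_delta):
    """Fraction of cell k's information that is also carried by the other cells."""
    i_k = info_bits(W[[k]], C_s, var_nu, var_delta)
    others = np.delete(W, k, axis=0)
    unique = info_bits(W, C_s, var_nu, var_delta) - info_bits(others, C_s, var_nu, var_delta)
    return (i_k - unique) / i_k

def pairwise_redundancy(W, i, j, C_s, var_nu, var_delta):
    """Pairwise redundancy normalized by the smaller single-cell information."""
    i_i = info_bits(W[[i]], C_s, var_nu, var_delta)
    i_j = info_bits(W[[j]], C_s, var_nu, var_delta)
    joint = info_bits(W[[i, j]], C_s, var_nu, var_delta)
    return (i_i + i_j - joint) / min(i_i, i_j)

# Toy example: two cells pooling one shared cone out of three.
C_s = np.eye(3)
W = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
print(pairwise_redundancy(W, 0, 1, C_s, var_nu=0.5, var_delta=0.1))
```

For cells that read from disjoint, uncorrelated cones the redundancy is zero, and the normalized pairwise value can never exceed 1.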
Simple developmental model of retinal connectivity.
We simulated a developmental model (see Fig. 5) to obtain an alternative connectivity matrix, W. The elements of this matrix, $W_{ij}$, were adjusted using an iterative learning rule, starting from the initial conditions described below. This iteration was implemented to achieve two goals.
(1) Response variance, $\sigma^2$, should equal the average variance of RGCs with the connectivity Wret in response to natural images. The target variance was set for each individual RGC type separately. In each iteration, the value $W_{ij}$ (connectivity from the jth cone to the ith model RGC) was incremented by a local update rule:

$$\Delta W_{ij} \propto (\sigma^2 - \sigma_i^2)\, \langle (s_j + \nu_j)\, u_i \rangle,$$

where $\sigma_i^2$ is the response variance of the ith neuron, $s_j$ and $\nu_j$ are, respectively, the signal and noise of the jth cone, $u_i$ is the response of the ith neuron, and 〈…〉 indicates the ensemble average over the presentation of natural images.
(2) Magnitude of the cone PF within each RGC type, φ, should equal the average magnitude of the PF in the measured connectivity Wret. A constant magnitude of cone PF per RGC type ensures that each RGC type uniformly samples the cone lattice without gaps, tiling the region of retina (Gauthier et al., 2009). In each iteration, Wij was incremented by another local update rule: where φj is the PF norm of the jth cone. Importantly, the optimal connectivity exhibits nearly constant PF norm for an entire RGC population (and we proved that this is exactly constant in the ideal case of a regular cone lattice with shift-invariant natural images). Hence, this biologically plausible rule leads to connectivity that satisfies a necessary condition for optimal information transmission.
The connectivity matrix was initialized by the model receptive fields (rows of W) with spatially localized Gaussian profiles with center locations taken from the data and SD equal to half the distance to the nearest RGC of the same type (Devries and Baylor, 1997). This effectively prohibited long-range connections, because the objective function is non-convex (fourth power of Wij) and has local minima. Iterative adjustment of the entries of W terminated when both target values in goals 1 and 2 were achieved simultaneously. These conditions satisfy the constraints of total response variance and squared synaptic strength, respectively, and hence allowed a fair comparison of the resulting connectivity with retinal (Wret) and efficient (Wopt) connectivity matrices.
Results
The circuitry of the retina transforms the visual information captured by a cone photoreceptor mosaic into electrical signals in multiple types of RGCs, which are then transmitted to the brain. We compared the spatial properties of a linear approximation of this transformation, measured at single-cell resolution, against predictions of efficient coding theory.
Measuring and modeling spatial processing in the retina
The spatial transformation from cone to RGC responses was measured using multielectrode recordings of peripheral macaque monkey retina ex vivo (Field et al., 2010). These recordings sampled the electrical activity of complete populations of the four numerically dominant primate RGC types: ON-Parasol, OFF-Parasol, ON-Midget, and OFF-Midget. Fine-grained visual stimulation was used to measure the spatial receptive fields of complete populations of these RGCs at the resolution of individual cones. These measurements quantified the strength of functional connection from every cone to every recorded RGC over a region of the retina.
The predictions of efficient coding were derived using a simplified response model, constructed to be comparable with the data while incorporating the statistical properties of natural images, noise, and biological constraints (Fig. 1). Achromatic natural images were obtained from a database (Doi et al., 2003), blurred according to the optics of the eye (Navarro et al., 1993), and represented in terms of the elicited photon absorptions of cones laid out in an irregular lattice as measured using physiological data (Field et al., 2010). These model cone signals, transformed by an instantaneous compressive nonlinearity (Baylor et al., 1987; Doi et al., 2003) and corrupted by noise, were combined linearly to produce model RGC signals. Finally, model RGC signals were corrupted with additive independent noise. The free parameters of the model were the strengths of inputs from all the model cones to all the model RGCs, summarized in a connectivity matrix W. By construction, W is directly comparable with the physiologically measured weights of cone inputs to RGCs, Wret. To test the predictions of efficient coding for the retinal circuitry, Wret was compared with an optimal connectivity matrix, Wopt, that was numerically optimized for information transmission. This optimization was performed subject to three resource constraints relevant to the retinal circuitry (see Materials and Methods): (1) number of RGCs (Campa et al., 1995; Doi and Lewicki, 2007); (2) total response variance of RGCs (Atick and Redlich, 1990, 1992; Atick et al., 1990; van Hateren, 1992b, 1993; Ruderman, 1994; Haft and van Hemmen, 1998; Doi and Lewicki, 2007); and (3) total squared synaptic strengths (Campa et al., 1995).
Coding efficiency of the retina
How efficiently does the retina process the spatial information in natural images? To answer this question, information transmission was calculated for two different model RGC populations: one with the physiologically measured connectivity (Wret) and the other with optimal connectivity subject to resource constraints (Wopt). Comparison of these values indicates the degree to which the retinal connectivity is efficient. In three recordings from different retinas, the retinal connectivity preserved 59.4 ± 8.1% (mean ± SD across datasets) of the visual information present in the cone lattice (defined by the information about cone signal that is gained after input noise is added; see Materials and Methods) compared with 74.8 ± 6.7% preserved with the optimal connectivity. Thus, RGCs transmit a large fraction of the visual information possible, exhibiting an overall efficiency of ∼80% of the maximum possible (82.4, 81.2, and 74.2%, respectively, for three datasets).
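The reported efficiency can be cross-checked with simple arithmetic from the numbers given above. Note that the ratio of the two across-dataset means need not equal the mean of the per-dataset efficiencies, although here both round to about 80%. The variable names are illustrative.

```python
# Percent of cone information preserved (means across the three datasets).
retina, optimal = 59.4, 74.8

# Per-dataset efficiency values reported in the text (percent).
per_dataset = [82.4, 81.2, 74.2]

ratio_of_means = 100 * retina / optimal
mean_of_ratios = sum(per_dataset) / len(per_dataset)

print(round(ratio_of_means, 1), round(mean_of_ratios, 1))   # prints: 79.4 79.3
```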
Receptive field organization
Does the retinal circuitry exhibit spatial structure consistent with efficient coding? A direct comparison of the measured RGC receptive fields (rows of Wret) with the optimal receptive fields (rows of Wopt) is not informative, because the optimal connectivity matrix Wopt is not uniquely specified by efficient coding. Specifically, multiplying a connectivity matrix by any orthogonal matrix P yields a new connectivity matrix that uses the same resources and transmits the same amount of information (see Materials and Methods). Thus, spatial receptive field structure does not provide a unique test of efficient coding.
A partial test of efficiency can be developed by finding, within the family of optimal connectivity matrices Wopt, the single connectivity matrix Wopt-fit that most closely matches the data (Fig. 2a). Mathematically, finding this matrix is equivalent to starting with any choice of Wopt and finding an orthogonal matrix P that minimizes $\|W_{ret} - P W_{opt}\|_F^2$. A close match between Wret and Wopt-fit would indicate efficient spatial structure of retinal receptive fields. The results show that indeed the receptive fields in Wopt-fit are similar to those in the retina (Fig. 2b,c). In three datasets, the squared error of Wopt-fit was 41.5 ± 10.5% of the sum of squared weights of Wret (34.3, 36.6, and 53.5%, respectively, for three datasets). For comparison, the receptive fields obtained with a randomly selected orthogonal matrix P (Fig. 2d) differ substantially from the measured receptive fields, with squared errors of 193.8 ± 2.4% (mean ± SD across three datasets, 100 samples each).
Projective field organization
Given the non-uniqueness of optimal receptive field structure (above), an incisive test of efficiency would ideally focus on an aspect of retinal circuitry that is both necessary and sufficient for optimality. We find that such unique predictions of efficient coding are given in terms of weights on the signals flowing from a particular cone to all the RGCs (across all RGC types), which we refer to as the projective field (PF) of the cone (Lehky and Sejnowski, 1988). More specifically, the unique spatial predictions of efficient coding are the squared magnitudes of the PF of each cone (i.e., strength of the diverging signal) and the similarity between the PFs of different cones (i.e., spatial overlap in their projections to RGCs) (see Materials and Methods).
The complete connectivity maps obtained in the physiological data (Field et al., 2010) provide the first opportunity to compare the spatial structure of cone PFs to the predictions of efficient coding. Figure 3a shows the inner products of the PF of one cone with the PFs of other cones, computed for retinal (Wret) and optimal (Wopt) connectivity matrices. In both cases, the inner products assume high positive values for nearby cones (indicating similarity of PFs), smaller negative values for surrounding cones (dissimilarity), and near-zero values for more distant cones. Figure 3b shows that this trend holds for all cone pairs: although the values obtained from the physiological data are significantly more variable than for the optimal solution, the average PF inner products as a function of distance are consistent with the theory.
Redundancy
Is efficient coding consistent with the redundancy observed previously in RGCs (Meister et al., 1995; Puchalla et al., 2005; Schneidman et al., 2006; Shlens et al., 2006; Ala-Laurila et al., 2011)? Because redundancy means that a portion of the information transmitted by one neuron is also transmitted by others (Gawne and Richmond, 1993; Brenner et al., 2000; Machens et al., 2001; Schneidman et al., 2003; Latham and Nirenberg, 2005), one might intuitively expect that a redundant code must be inefficient. Indeed, redundancy reduction has often been stated as an objective synonymous with efficient coding, and, in some special cases, this is correct (Barlow, 1961; Atick and Redlich, 1990; Atick et al., 1990; van Hateren, 1992b; Bell and Sejnowski, 1997). In other cases, however, redundancy can serve to overcome the deleterious effects of noise, improving information transmission (Atick and Redlich, 1990; Atick et al., 1990; van Hateren, 1992b; Barlow, 2001; Zhaoping, 2006; Doi and Lewicki, 2007; Borghuis et al., 2008; Tkacik et al., 2010). This raises the possibility that the redundancy found in the retina is consistent with efficient coding.
As a measure of redundancy, we estimated the fraction of the spatial information conveyed by a single RGC that is also conveyed by other RGCs of the same type (see Materials and Methods). The redundancy associated with the connectivity in the retina (Wret) was 28.7 ± 7.8% (mean ± SD for each cell type, three datasets), whereas the redundancy associated with efficient coding (Wopt-fit) was 26.3 ± 11.5%. Although substantial, both of these were much lower than the 86.5 ± 3.7% redundancy in the cone lattice, analogous to the previous findings of redundancy reduction in the successive stages of auditory sensory systems (Chechik et al., 2006). Also, consistent with previous reports of correlated activity in the retina (Mastronarde, 1989; Meister et al., 1995; Puchalla et al., 2005; Shlens et al., 2006; Ala-Laurila et al., 2011), redundancy between pairs of neighboring cells of the same type was high (up to 30%) and declined with distance, for both the retina and the efficient coding model (Fig. 4). In summary, we found that the degree and spatial organization of redundancy in the retina closely matched the predictions of efficient coding.
Discussion
A detailed test of the efficiency of spatial information coding in the retina was made possible by two advances. First, we developed a tractable model that allowed the computation and optimization of transmitted information in inhomogeneous neural circuits, with multiple constraints on biological resources. Second, we made use of new experimental data and analyses that allow determination of complete functional connectivity between populations of cones and RGCs. The combination of these approaches yielded several new findings. First, under these resource constraints, we find that the retina transmits ∼80% of the maximally achievable spatial information about natural images. Second, the functional connectivity between cones and RGCs exhibits unique spatial structure, as captured by PF inner products, consistent with coding efficiency. Finally, the redundancy of spatial information encoded by RGCs has the degree and spatial organization expected from an efficient code.
Previous work has shown that behavioral measurements of visual sensitivity in humans exhibit a bandpass spatial characteristic and changes with light level that are broadly consistent with efficient coding theory (Atick and Redlich, 1990, 1992; van Hateren, 1992b, 1993). Although this result was interpreted in terms of the prototypical center-surround receptive field structure of RGCs, it provided no means to directly compare with physiological measurements. In addition, the theoretical formulation assumed a homogeneous population of rotationally symmetric receptive fields, laid out on a uniform lattice, and equal in number to the cones. In contrast, our formulation (Doi et al., 2010) incorporates much of the variability and irregularity observed in real retinas, including the mismatch in sizes of the populations of cones and RGCs. The use of response variance as a constraint to account for the metabolic cost of spike generation may be found in several previous studies (Atick and Redlich, 1990, 1992; Atick et al., 1990; van Hateren, 1992b, 1993; Ruderman, 1994; Doi and Lewicki, 2007). However, we included an additional constraint on total squared strength of connectivity, reflecting the cost of synaptic maintenance and transmission; this has a significant effect on shaping the solution (Doi et al., 2010). [We chose the L2 norm constraint for connectivity weights because of its analytical tractability (Campa et al., 1995), although the L1 norm constraint may be a more natural choice (Vincent and Baddeley, 2003; Vincent et al., 2005).] Incorporating these constraints, as well as the measured organization of the cone lattice, made it possible to derive and test the theoretical predictions of efficient coding directly in individual retinas.
The present work reveals two novel aspects of efficient coding. First, the theory shows that the necessary and sufficient empirical predictions of efficient coding relate to PF structure rather than receptive field structure. Traditional measurement approaches do not reveal PFs, but the physiological measurement of complete functional circuitry presented here made it possible to test this key theoretical prediction directly. Second, the significant spatial redundancy found among RGCs (∼30%) is consistent with the predictions of efficient coding (cf. Puchalla et al., 2005; Ala-Laurila et al., 2011). Although previous theoretical work has shown that efficient coding can lead to redundant representations (Atick and Redlich, 1990; Atick et al., 1990; van Hateren, 1992b; Barlow, 2001; Zhaoping, 2006; Doi and Lewicki, 2007; Borghuis et al., 2008; Tkacik et al., 2010) and experimental work has shown that the retinal signals are redundant (Meister et al., 1995; Puchalla et al., 2005; Schneidman et al., 2006; Shlens et al., 2006; Ala-Laurila et al., 2011), the results presented here provide the first direct test of the consistency between these theoretical predictions and experimental results.
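As an illustration of what a redundancy figure can mean in a linear Gaussian setting, the sketch below compares the information carried jointly by a set of RGCs with the sum of the information each carries alone. The definition used here (one minus the ratio of joint to summed individual information) is one common choice and is an assumption; it is not necessarily the measure used in this study, and all names and values are hypothetical.

```python
import numpy as np

def gaussian_information(W, C_x, sigma_in, sigma_out):
    """Information (bits) about Gaussian x in r = W(x + n_in) + n_out."""
    W = np.atleast_2d(W)
    n_rgc, n_cone = W.shape
    C_total = (W @ (C_x + sigma_in**2 * np.eye(n_cone)) @ W.T
               + sigma_out**2 * np.eye(n_rgc))
    C_noise = sigma_in**2 * (W @ W.T) + sigma_out**2 * np.eye(n_rgc)
    _, logdet_total = np.linalg.slogdet(C_total)
    _, logdet_noise = np.linalg.slogdet(C_noise)
    return 0.5 * (logdet_total - logdet_noise) / np.log(2.0)

def redundancy(W, C_x, sigma_in, sigma_out):
    """1 - I(joint) / sum_i I(single RGC i); assumed definition, see lead-in."""
    joint = gaussian_information(W, C_x, sigma_in, sigma_out)
    singles = sum(gaussian_information(w, C_x, sigma_in, sigma_out) for w in W)
    return 1.0 - joint / singles

# Two cones with uncorrelated signals and two RGCs (toy values).
C_x = np.eye(2)
W_separate = np.array([[1.0, 0.0], [0.0, 1.0]])   # each RGC reads its own cone
W_shared = np.array([[1.0, 0.0], [1.0, 0.0]])     # both RGCs read the same cone

r_separate = redundancy(W_separate, C_x, sigma_in=0.0, sigma_out=0.5)
r_shared = redundancy(W_shared, C_x, sigma_in=0.0, sigma_out=0.5)
```

In this toy case the RGCs that sample separate, uncorrelated cones have zero redundancy, whereas the pair that samples the same cone has positive redundancy.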
The high degree of efficiency exhibited by the retina is presumably achieved through a combination of genetic, developmental, and homeostatic mechanisms. It seems unlikely that such mechanisms could be orchestrated to directly optimize information transmission, as we have done in optimizing our model. However, it is natural to ask whether a simpler and more plausible set of constraints might provide a sufficient proxy. Toward this end, a “developmental” model was considered (see Materials and Methods) based on three constraints: (1) the response variances should be constant across RGCs; (2) the PFs of cones to a given RGC type should have fixed magnitude; and (3) long-distance connections between cones and RGCs are prohibited. All three can be plausibly optimized using local learning rules, and all three constraints are consistent with the optimally efficient solution, as well as the regular and uniform arrangement of retinal circuitry. Simulations of this model lead to receptive field structure and organization (Fig. 5), coding efficiency (82.4%), and redundancy (26.1 ± 11.0%), similar to those observed in the data. We conclude that the retina could, in principle, achieve efficient information transmission and the associated redundancy using simple developmental mechanisms.
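The three constraints of the developmental model can each be written as a simple operation on a weight matrix W (RGCs × cones). The sketch below is a hypothetical illustration, not the procedure described in Materials and Methods: the function names, the distance cutoff, the toy geometry with a single RGC type, and the input covariance are assumptions. Each function enforces one constraint in isolation; satisfying all three at once would require an iterative scheme that is not shown.

```python
import numpy as np

def pairwise_distance(rgc_pos, cone_pos):
    """Distance between every RGC and every cone (RGCs x cones)."""
    return np.linalg.norm(rgc_pos[:, None, :] - cone_pos[None, :, :], axis=-1)

def prune_long_connections(W, rgc_pos, cone_pos, max_dist):
    """Constraint 3: zero out weights between cones and distant RGCs."""
    d = pairwise_distance(rgc_pos, cone_pos)
    return np.where(d <= max_dist, W, 0.0)

def fix_pf_magnitude(W, target):
    """Constraint 2: scale each cone's projective field (a column) to a fixed norm."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    return W * target / np.where(norms > 0, norms, 1.0)

def equalize_response_variance(W, C_in, target_var):
    """Constraint 1: scale each RGC's weights (a row) to a common response variance."""
    var = np.einsum("ij,jk,ik->i", W, C_in, W)
    scale = np.sqrt(target_var / np.where(var > 0, var, 1.0))
    return W * scale[:, None]

# Toy geometry: 12 cones on a line, 4 RGCs of one type (hypothetical values).
rng = np.random.default_rng(1)
cone_pos = np.stack([np.arange(12.0), np.zeros(12)], axis=1)
rgc_pos = np.stack([np.array([1.5, 4.5, 7.5, 10.5]), np.zeros(4)], axis=1)
idx = np.arange(12)
C_in = 0.9 ** np.abs(idx[:, None] - idx[None, :]) + 0.1 * np.eye(12)

W = rng.normal(size=(4, 12))
W_local = prune_long_connections(W, rgc_pos, cone_pos, max_dist=2.0)
W_pf = fix_pf_magnitude(W_local, target=1.0)
W_var = equalize_response_variance(W_local, C_in, target_var=1.0)
```

All three operations use only quantities available at a single cone or a single RGC, which is the sense in which such constraints could plausibly be met by local rules.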
Several significant limitations in the analysis of efficiency could be addressed in future work. First, our model does not account for temporal properties of neural response or temporal structure in natural scenes. Inclusion of temporal domain information might help to explain the existence of multiple types of RGCs (van Hateren, 1992a; Dong and Atick, 1995). Second, the theory was made tractable by assumptions of linear processing, additive Gaussian noise, and Gaussian signal statistics. All of these assumptions are contradicted to some degree by empirical findings, but substantial advances in analytical methods will be required to incorporate them into the theory (but see Borghuis et al., 2008; Ratliff et al., 2010; Karklin and Simoncelli, 2011; Rahnama Rad and Paninski, 2011; Pitkow and Meister, 2012). Finally, the predictions of efficient coding were tested only in the retina, a neural circuit with unique experimental accessibility that makes high-resolution and complete measurements possible. However, the theory is general and will undoubtedly be investigated in other neural structures as advances in measurement technology permit.
Footnotes
This work was supported by National Institutes of Health Grant EY018003 (L.P., E.J.C., and E.P.S.) and Howard Hughes Medical Institute (E.P.S.).
The authors declare no competing financial interests.
Correspondence should be addressed to Eizaburo Doi, Glennan Room 321, 10900 Euclid Avenue, Cleveland, OH 44106. edoi@case.edu
This article is freely available online through the J Neurosci Open Choice option.