## Abstract

Coupling between sensory neurons impacts their tuning properties and correlations in their responses. How such coupling affects sensory representations and ultimately behavior remains unclear. We investigated the role of neuronal coupling during visual processing using a realistic biophysical model of the vertical system (VS) cell network in the blow fly. These neurons are thought to encode the horizontal rotation axis during rapid free-flight maneuvers. Experimental findings suggest that neurons of the VS are strongly electrically coupled, and that several downstream neurons driving motor responses to ego-rotation receive inputs primarily from a small subset of VS cells. These downstream neurons must decode information about the axis of rotation from a partial readout of the VS population response. To investigate the role of coupling, we simulated the VS response to a variety of rotating visual scenes and computed optimal Bayesian estimates from the relevant subset of VS cells. Our analysis shows that coupling leads to near-optimal estimates from a subpopulation readout. In contrast, coupling between VS cells has no impact on the quality of encoding in the response of the full population. We conclude that coupling at one level of the fly visual system allows for near-optimal decoding from partial information at the subsequent, premotor level. Thus, electrical coupling may provide a general mechanism to achieve near-optimal information transfer from neuronal subpopulations across organisms and modalities.

## Introduction

Flying organisms require fast, reliable feedback regarding ego-motion. This information is extracted from the optic flow; the motion of the external world as perceived by the organism (Lee and Kalmus, 1980; Borst and Bahde, 1988). In the visual system of the fly, neurons of the lobula plate receive as input a two-dimensional, retinotopic representation of the optic flow, allowing them to encode rotational and translational velocities (Borst and Haag, 2002; Borst and Weber, 2011). The lobula plate serves as a primary relay between early vision and downstream motor centers (Strausfeld and Bassemir, 1985; Haag et al., 2007; Wertz et al., 2008).

Approximately 60 large tangential cells responsive to wide-field motion have been identified within the lobula plate of each hemisphere of the blow fly (Hengstenberg, 1982; Hausen, 1984). Ten of these neurons comprise the vertical system (VS) which is thought to encode the azimuthal direction of rotations in the horizontal plane (Fig. 1*A*; Krapp and Hengstenberg, 1996). These cells were the focus of our study.

The visual information available to the fly is rich, but only part of it is essential to control flight. The VS cells encode an essential parameter from this complex input; the horizontal axis of ego-rotation. Estimating this parameter is a problem of marginalization, as the quantity of interest must be disentangled from irrelevant visual information (Fig. 1*B*).

Electrical coupling between adjacent VS cells shapes their responses (Haag and Borst, 2004; Farrow et al., 2005). Our goal was to examine the impact of coupling on the representation of the azimuthal angle of rotation in the VS population response. Our results build on previous qualitative observations about the role of coupling (Cuntz et al., 2007; Weber et al., 2008; Elyada et al., 2009).

To examine the role of coupling we presented random rotating images (Fig. 1*C*) as input to a biophysically plausible model of the VS cell network (Borst and Weber, 2011). In contrast to previous studies, we considered transient responses and applied probabilistic modeling methods to compute optimal Bayesian estimators from VS activity instead of using heuristic or suboptimal estimators.

Anatomical and electrophysiological studies of lobula plate neurons have characterized a pair of premotor neurons at the next stage of processing of the fly's nervous system. The strongest projections of the VS population onto these descending neurons originate from a subset of the VS population (Wertz et al., 2009a). When considering such a partial readout, we found that coupling-induced changes in tuning, correlations, and reliability were crucial for an accurate representation of the angle of rotation. Surprisingly, we also found that the quality of the optimal estimate from the collective response of VS cells does not depend on coupling strength.

Gap junction coupling between VS cells can thus impact the accuracy of a subpopulation readout, and hence the fly's ability to navigate. Our results suggest that across species and modalities electrical coupling can distribute information across a neural population, and significantly increase the performance of estimates extracted from a subpopulation response.

## Materials and Methods

##### Model of the VS network.

Our study is based on a model of the VS tangential cells closely related to that of Borst and Weber (2011). We briefly describe the model, and note the differences between the specific implementations. Parameters not explicitly stated, and details of the model not discussed are identical to those given by Borst and Weber (2011).

To mimic stimulation of the fly visual system under a variety of conditions, including natural flight, we started by projecting a random image onto the surface of a sphere. We considered images that consist of randomly arranged bars of varying sizes, as well as random checkerboard images and compositions of natural scenes. Spherical images were rotated about a horizontal axis with varying azimuthal angle, thereby generating a pattern of optic flow (Fig. 2*A*). Details on image and optic flow stimulus generation are given in the next section.

Processing of these stimuli by the fly visual system is captured by several successive computational steps. The rotated image sequences (“optic flow stimuli”) were first filtered by an array of vertically oriented local motion detectors (LMDs or “Reichardt detectors”; Reichardt, 1987; Borst et al., 2003; Haag et al., 2004). The LMDs were spaced approximately evenly on the surface of the sphere. There were 5000 detectors per hemisphere, corresponding approximately to the number of facets on the left and right eyes (Hengstenberg, 1992). The input to a single detector was composed of luminance signals from two vertically aligned pixels separated by an elevation of 2°. First-order filtered low- and high-pass versions of the input from the two pixels were cross-multiplied and then subtracted (Fig. 2*B*). A negative (respectively positive) detector response reflected upward (respectively downward) motion. The downward and upward components were separately weighted according to the dendritic receptive field (RF) of each cell, and activated the dendritic compartment of the VS model neurons as excitatory and inhibitory conductances, respectively. Dendritic RFs were vertically centered Gaussians, with horizontal width 15° and vertical width 60° (Fig. 2*C*). Hence, each cell effectively sampled the entire vertical surround above and below its RF center, given in Figure 1*A*. Maximal vertical velocity within a cell's RF was generated by azimuthal rotation angles approximately orthogonal to the centers of the RF. Such rotations therefore resulted in maximal excitation or inhibition.

The axonal compartments of adjacent, ipsilateral VS neurons are electrically coupled to each other. Figure 2*D* shows a schematic of the model processing stages. Figure 2*E* shows the response of each VS neuron to downward stimulation in a narrow (10° wide) vertical strip which was swept across the visual field. The vertical strip contained a square-wave horizontal grating with a spatial frequency of 22.5° drifting downward at a constant velocity of 125°/s, corresponding to a temporal frequency of 5.5 Hz.

The axonal and dendritic membrane potentials for the VS neurons in each hemisphere evolve according to the following:
Here **V**_{Ax} and **V**_{De} are vectors whose entries correspond to the 10 axonal and dendritic voltages, respectively. The full VS model consists of two copies of this system, representing the activity of the system in the left and right hemispheres. The two differ only in their RF centers (Figs. 1*A*, 2*C*). The parameter *g*_{Ax-Den} sets the conductance for the coupling of the axonal and dendritic compartments of each neuron, whereas *C*_{m} is the membrane capacitance and *g*_{L,Ax}, *g*_{L,De} are the leak conductances of each compartment. The membrane time constant of each compartment is: τ* _{X}* =

*C*

_{m}/

*g*

_{L,}

*,*

_{X}*X*= Ax, De. The resting potential of each compartment is zero. Following a perturbation from rest, the membrane potential decays exponentially back to the resting potential with a characteristic timescale τ

*. Intrinsic variability is modeled by standard white noise processes ξ*

_{X}_{Ax}(

*t*), ξ

_{De}(

*t*), and σ

_{Ax}, σ

_{De}, set the noise intensities.

The input currents to the dendrite of cell *i* result in the term *I _{i}*(

*t*) =

*E*

_{E}

*g*

_{E,}

*(*

_{i}*t*), where

*g*

_{E,}

*(*

_{i}*t*) is the excitatory conductance to cell

*i*induced by the optic flow stimulus,

*E*

_{E}is the associated reversal potential, and likewise for the inhibitory quantities

*g*

_{I,}

*(*

_{i}*t*) and

*E*

_{I}. The matrix

*G*

_{De}(

*t*) is diagonal with entries describing the leak conductance, axonal coupling, and input currents as follows: The matrix

*G*

_{Ax}has entries given by: Here,

*g*

_{gap}sets the strength of the axo-axonal gap junction coupling between adjacent, ipsilateral VS neurons. One difference between our simulation protocol and that of Borst and Weber (2011) is that we generated visual inputs at time steps of 1 ms, but integrated Equation 1 at a smaller time step of 0.01 ms to guarantee numerical accuracy. We first calculated the conductances elicited by the optic flow stimulus at the coarser time step, then linearly interpolated to obtain a realization of the conductance at the finer timescale. Typical responses of the uncoupled and coupled systems to rotation of a random bar image at θ

_{stim}= 90° are shown in Figure 3

*A*,

*B*, respectively. Central to the ability of the VS population to encode the axis of rotation is the strong, nonlinear dependence of a VS neuron's response on the rotational velocity of the visual stimulus within its RF.

When we consider the encoding of the rotation axis in the VS axonal responses, we take the output of the system to be temporal averages of the axonal membrane potential. For transient responses, the VS output is as follows:
In particular, when considering transient responses, we assume the system starts from rest (0 mV) at the beginning of the period over which we average. Similarly, steady-state responses were computed as follows:
In contrast to the transient response defined above, the steady-state response is defined so that, at the beginning of the integration period (τ_{ss}), the entire VS system is (approximately) in steady-state. Despite the fast time constants of VS model neurons, they do not immediately reach steady-state, because it takes some time for the motion detector-filtered stimulus received by the VS dendrites to equilibrate. In Figure 3, the shaded boxes indicate the periods over which we calculated the transient and steady-state responses.

It has been observed that there is a mutually inhibitory interaction between the end cells (VS1 and VS10) in each hemisphere. This may be implemented by electrical coupling of VS7–10 to an inhibitory cell Vi which forms a chemical synapse onto the ipsilateral VS1 cell, and electrical coupling of VS1 to an inhibitory cell Vi2 which forms a chemical synapse onto the ipsilateral VS10 cell (Haag and Borst, 2007; Borst and Weber, 2011). Following Weber et al. (2008), we implemented this “repulsive” coupling using a negative-conductance gap-junction between VS1 and VS10 (*g*_{inh} in Eq. 2). This repulsive coupling was scaled when we changed the strength of axo-axonal gap junction coupling between VS neurons. Unless otherwise specified, we set *g*_{inh} = −0.06 *g*_{gap}. Our findings do not depend qualitatively on the presence of this connection (results not shown).

For simplicity, we did not model several known functional and anatomical properties of VS cells, such as the rotational structure of their RFs (Krapp and Hengstenberg, 1996; Krapp et al., 1998) or dendrodendritic connections with the dCH neuron (Haag and Borst, 2007). These properties of the VS network are the subject of current experimental investigations (Hopp et al., 2014). Although we do not expect them to significantly affect our conclusions, investigating the impact of such VS cell features is an important avenue for future investigation. Previous computational studies of the VS network have made similar simplifying assumptions (Karmeier et al., 2005; Cuntz et al., 2007; Weber et al., 2008; Elyada et al., 2009, 2013).

##### Generation of images and optic flow patterns.

Optic flow patterns were generated by first projecting various types of random images onto the surface of the unit sphere. We considered three classes of random images: random bars, random checkerboards, and natural scenes (Fig. 1*C* shows examples of each type of image). In the first two cases, images were binary, pixel intensities were either 0 or 1, and for natural scenes, pixel intensities varied continuously between 0 and 1. All images were discretized at 1° increments in spherical coordinates. Throughout, images were generated independently across trials.

Random bar images were parameterized by the number of bars, as well as bar width and length. Each bar was generated by first randomly placing an initial line segment of the specified bar width on the surface of the unit sphere. We then expanded this segment along the direction of the length of the bar by rotation about the appropriate axis, turning “on” all pixels along the path touched by the rotating segment. Bar images used in Results (see Figs. 5⇓–7) contained 25 bars of length 40° and width 5°. Choosing bars with different dimensions or changing the number of bars did not affect the results qualitatively (data not shown).

For checkerboard images, we defined a coarse discretization of the image consisting of 4° × 4° squares, and randomly set all pixels within a square to be zero or one, independently across squares. Last, for natural images, we first took a subcollection of one hundred natural scenes from the van Hateren and Schilstra (1999) dataset. These images were selected to exclude man-made objects, sharp edges, and gratings. We then randomly selected six (with replacement) of these 100 images, projected them onto the sides of a cube, which was itself then projected onto a sphere, mimicking the approach of Borst and Weber (2011). For natural scene compositions, we also included an initial rotation of random magnitude about a randomly chosen (not generally horizontal) axis to control for the effects of edges between different natural scenes. We also confirmed our results with images chosen randomly from the van Hateren and Schilstra (1999) dataset without restrictions. Results were nearly identical (data not shown).

The sequences of images comprising optic flow patterns were generated by rotation of the sphere about an axis in the horizontal plane (we did not consider translatory motion). At a given positive time, the value of a pixel was set equal to the value of the pixel obtained by a reversal of the rotation applied to the original image. The rotational velocity was constant across simulations and set to 500°/s, falling well within the parameters of typical motion of the fly during flight (Egelhaaf et al., 2012). This value is also consistent with values considered in previous computational studies of the VS network (Karmeier et al., 2005; Cuntz et al., 2007; Weber et al., 2008; Elyada et al., 2009). Increasing or decreasing the rotational velocity to 250°/s or 750°/s did not affect the results quantitatively, nor change our general conclusions (data not shown).

##### Optimal linear and zero-crossing estimators.

To assess the ability of the VS cell network to encode the direction of rotation, we considered several estimators based on the axonal membrane potential of VS cells. In the following three sections, the reader should keep in mind that the random vector **V̄** is a surrogate for the time-averaged axonal response of VS cells. We will define the optimal linear estimator of the rotation axis based on the steady-state averaged axonal response of VS cells, **V̄**_{Ax}^{ss}, defined in Equation 4. In subsequent sections, we will also define the minimum mean-square estimator (MMSE) using the transient averaged axonal response, **V̄**_{Ax}^{tr}, defined in Equation 3.

For our analysis of the steady-state encoding of the axis of rotation, we applied an optimal linear estimator (OLE), the linear estimator which minimizes the expected value of the squared error over all stimuli and responses (Salinas and Abbott, 1994). We considered a linear, rather than an affine estimator because of the (near) rotational symmetry of the system. The OLE is simple and intuitive, and we used it to demonstrate the impact of gap junction coupling between VS cells. For a more detailed analysis we used the MMSE defined in the section entitled “Computation of the MMSE.”

We are interested in estimating the axis of rotation which is characterized by the unit vector, **s**, pointing along its direction. This vector is in the horizontal plane of the fly, with azimuthal angle θ_{stim}. The OLE is a linear combination of the responses of the *N* neurons, **V̄** = (*V̄*_{1},…,*V̄ _{N}*)

*(where*

^{T}*T*denotes transposition). To obtain the OLE, we denote the joint probability density of stimuli and responses by

*P*(

**V̄,s**) =

*P*(

**V̄**|

**s**)

*P*(

**s**). We assume throughout the following that the prior distribution over the stimuli,

*P*(

**s**), is flat. The tuning curve for the

*i′th*neuron is then: We also set

**L**

*=*

_{i}**E[s**μ

*(*

_{i}**s**)] = ∫

**s**μ

*(*

_{i}**s**)

*P*(

**s**)

*d*

**s**, and let

*L*= [

**L**

_{1}|…|

**L**

*] be the matrix with columns*

_{N}**L**

_{1},…,

**L**

*. We denote the second moment of the responses of the*

_{N}*i′th*and

*j′th*neurons (averaged across stimulus values) by Σ

*= ∫*

_{ij}*V̄*(

_{i}V̄_{j}P**V̄**|

**s**)

*P*(

**s**)

*d*

**s**

*d*

**V̄**. Given an observed response,

**V̄**

_{obs}, the OLE then has the following form: where Σ is the matrix of second moments, Σ

*. Note that the OLE requires measurement of only first and second moments of the response. It is thus considerably simpler to obtain than the true MMSE defined in the next section, which requires knowledge of the full distribution,*

_{ij}*P*(

**V̄**|

**s**). The optimal linear estimator and the MMSE coincide only when the joint distribution

*P*(

**V̄**,

**s**) is Gaussian, which is not generally true for the system we consider.

The stimulus **s** estimated by the vector **ŝ** was a unit vector pointing along the axis of rotation in the horizontal plane. However, the azimuthal angle, θ_{stim}, of **s** is the behaviorally relevant variable for optomotor control by the fly. We therefore report the angle θ̂ that the vector **ŝ** makes with the direction that the fly is facing, i.e., the azimuthal angle of **ŝ**, as in earlier analyses of similar directional stimuli (Georgopoulos et al., 1988; Salinas and Abbott, 1994; Lewis and Kristan, 1998).

We also implemented an estimator proposed by Cuntz et al. (2007) and Elyada et al. (2009) who suggested that the axis of rotation can be defined as a “zero crossing” of the population response. To obtain this estimator we defined for each cell a “zero angle”. Intuitively, this is the azimuthal rotation angle that is most likely to elicit no response for a given cell. Because rotation about the azimuthal angle coinciding with the center of a VS cell RF will elicit little downward and upward motion within the RF, the zero angles coincide precisely with the centers of the RFs shown in Figure 1*A*. Then, to estimate the “zero crossing” rotation angle given the response to a rotation about the stimulus angle, θ_{stim}, we search for consecutive pairs of VS neurons which exhibit a sign change in their responses (i.e., their membrane potential responses lie below and above their resting values, respectively). These neurons have the smallest responses to the stimulus, and hence the rotation angle is likely to lay between their respective zero angles. The axis of rotation is then estimated by linearly interpolating between the two zero angles based on the responses of these two VS neurons.

More precisely, we define the zero angle of the *i′th* cell on each side, θ_{i}^{0}, as the angle that maximizes the likelihood of giving a zero axonal response:
The posterior distribution of the azimuthal angle of the stimulus conditioned on the zero response of a VS cell will generally have two relative maxima, about 180° apart. To resolve this ambiguity we choose the angle at which voltage is increasing with increasing θ.

To implement the zero-crossing estimator, we first search for a pair of consecutive VS neurons which exhibit responses *V̄*_{1} < 0, *V̄*_{2} > 0, having zero angles θ_{1}^{0}, θ_{2}^{0}, where we assume the two cells are labeled so that θ_{2}^{0} − θ_{1}^{0} mod 360° ∈ [0°, 180°). A sufficient condition for such a pair to exist is existence of at least one pair of VS neurons exhibiting positive and negative responses, respectively. Once such a pair is located, the zero-crossing estimate θ_{est}^{ZC} is the angle associated with the zero potential level for the line connecting (θ_{1}^{0},*V̄*_{1}) and (θ_{2}^{0},*V̄*_{2}),

##### Modeling the joint distribution of VS axonal responses using copulas.

For our analysis of the encoding of the axis of rotation in the transient state, we applied the true MMSE. Obtaining this estimator of the rotation axis requires estimating the joint probability distribution of axonal membrane potential of VS cells given the stimulus. However, even with the benefit of modern computational power, it is not feasible to directly estimate probability distributions for continuous variates in more than a few dimensions. For this reason, we must first formulate an approximation of the joint probability distribution of VS axonal responses to implement the MMSE.

Two approaches to this problem are to fit a maximum entropy distribution that matches a set of empirical statistics of the data (Jaynes, 1957; Roudi et al., 2009; Shlens et al., 2009; Ohiorhenuan et al., 2010; Fairhall et al., 2012), or to apply copulas (Nelsen, 2006). We chose the latter approach, common in the valuation of financial derivatives, but not widely applied in neuroscience (cf. Berkes et al., 2009; Onken et al., 2009a,b). One advantage of the copula approach is that it allows us to use the empirical marginal probability distributions in the fit.

To fix ideas, we remind the reader that we will fit a copula to the probability distribution of the time-averaged transient VS axonal response, defined in Equation 3. Thus, we consider a random vector **V̄** = (*V̄*_{1},…,*V̄ _{N}*)

*with cumulative distribution function*

^{T}*F*(

*v̄*,…,

_{1}*v̄*). A copula for the distribution function

_{N}*F*is a function

*C*: ℝ

*→ℝ such that: Here*

^{N}*F*(·) is the marginal cumulative probability distribution function for the variable

_{i}*V̄*,

_{i}*i*= 1, …,

*N*. The copula

*C*exists for any distribution

*F*with marginals {

*F*}

_{i}_{i=1}

^{N}(cf. Nelsen, 2006, his Theorem 2.10.9). From Equation 5, it is clear that

*C*determines completely the intervariable dependence structure contained in the distribution

*F*in terms of the marginal distributions,

*F*.

_{i}An *N*-dimensional copula is equivalent to a distribution function on the *N*-dimensional unit hypercube [0,1]* ^{N}* with uniform marginals: define the random vector

**U**= (

*U*

_{1}, …,

*U*) where

_{N}*U*=

_{i}*F*(

_{i}*V̄*). The probability integral transform implies that each

_{i}*U*is a marginally uniformly distributed random variable (Gabbiani and Cox, 2010, their Sect. 11.8). The copula

_{i}*C*(

*u*

_{1}, …,

*u*) for

_{N}**V̄**has an equivalent definition as the distribution function of

**U**.

The “curse of dimensionality” prevents us from directly approximating the corresponding copula *C*. A common approach is to select a parameterized copula family which can then be fit via the maximum likelihood principle (Yan, 2007). We applied the Gaussian copula (Xue-Kun Song, 2000), which takes the following form:
Here **Φ**_{Σ} is the joint Gaussian distribution function with correlation matrix and Σ; (i.e., Σ* _{ii}* = 1 for each

*i*) and

**Φ**is the standard univariate Gaussian distribution function. Given independent identically distributed samples of a random vector

**V̄**= (

*V̄*

_{1}, …,

*V̄*)

_{N}*with marginals [*

^{T}*F*], the correlation Σ

_{i}*equals (Bouyé et al., 2000): Here corr(*

_{ij}*x*,

*y*) denotes the correlation coefficient of

*x*and

*y*(Xue-Kun Song, 2000).

Given a general copula distribution function *C* as defined in Equation 5, the copula density is defined as follows: *c*(*u*_{1},…,*u _{N}*) =

The joint density f corresponding to the distribution *F* may be written as follows:
where each *f _{i}* is the marginal density corresponding to the distribution

*F*. The density of the Gaussian copula may be expressed in closed form as follows: We fit a Gaussian copula to the transient response of the system,

_{i}**V̄**

_{Ax}

^{tr}(

*T*). To test the goodness of the fit distribution, we selected 20 random subsets of three left-side VS neurons, and associated with each subset a random stimulus rotation angle. We then sampled the marginal copula for each subset [i.e.,

*C*(

*u*,

_{i}*u*,

_{j}*u*) with

_{k}*u*=

_{i}*F*(

_{i}*V̄*) in Eq. 5], comparing it with the fit copula used in the MMSE calculations using Equations 6 and 7. We compared the empirically observed and fit copula distribution values at 1000 equally spaced points in the unit cube of the form (0.1

_{i}*i*, 0.1

*j*, 0.1

*k*), 1 ≤

*i*,

*j*,

*k*≤ 10. In Figure 4

*A*, we present a probability-probability (P–P) plot of the true (empirical) copula values against the fit values. In other words, letting

*C*

^{true}and

*C*

^{fit}indicate the true and fit copula distribution functions, Figure 4 presents a scatter plot of the 1000 points: We also computed for each of the 20 subsets and for all 1000 sample points the relative error between the probabilities from the true and fit copula distributions. In particular, we defined the following for each point: Because the probabilities lie within [0,1] by definition, the relative errors also lie within [0,1], with the value 0 indicating a perfect match. In Figure 4

*B*, we plot a histogram of the relative errors for all 20 random subsets (thus comprising a total of 20,000 data points). We found the true and fit copula distributions generally agreed quite well. The average relative error across all 20,000 points was ≈0.0438, and 90.3% of relative errors were <0.1.

##### Computation of the MMSE.

To obtain the MMSE (Kay, 1993), we first simulated the response of the VS network to determine an empirical estimate of the marginal distribution functions *F _{i}*. We did not assume a parametric form for the marginal distributions, but obtained a discrete estimate by binning values of the membrane potential integral at a sufficiently fine resolution. We then fit the Gaussian copula to the joint responses. Additional details are given in the last section of Materials and Methods.

Both the marginal distributions and the copula were determined as a function of the stimulus rotation angle θ_{stim} at a resolution of 1°. Marginal distribution histograms were approximated from 10,000 samples at each rotation angle, and the copula from 1000 samples. The MMSE was then computed based on 1600 samples taken at 5° increments.

The MMSE of the axis of rotation is calculated as the mean of the posterior distribution of the rotation vector given the axonal membrane potentials. To be precise, given an observed response **V̄**_{Ax}^{tr}(*T*), we associated each stimulus angle value θ_{stim} with a corresponding two-dimensional unit rotation vector **s**(θ_{stim}) = [cos(θ_{stim}), sin(θ_{stim})]* ^{T}*. The estimate, θ̂

_{stim}

^{MMSE}, was obtained by first computing an estimate of

**s**by averaging over the posterior distribution, i.e.: and then reporting the azimuthal angle of this estimate, θ̂

_{stim}

^{MMSE}= arg

**ŝ**

^{MMSE}, as done above for the OLE and zero-crossing estimators. Values of the posterior density

*P*(θ

_{stim}|

**V̄**

_{Ax}

^{tr}(

*T*)) were determined using the fit copula and the measured marginal distributions, along with Equations 8 and 9. The integral over the posterior density was calculated via simple Riemann integration at a 1° discretization.

In much of the Results we obtain the MMSE from a partial readout of the VS response. The estimation procedure is the same regardless of the size of the response vector **V̄**_{Ax}^{tr}(*T*). However, we will sometimes use the notation **V̄**_{Ax}^{subpop,tr}(*T*) to emphasize that an estimate is based on the response of a subpopulation.

##### An approximating Ornstein-Uhlenbeck model.

Assessing the generality of our results required determining whether our observations depended on the details of the model. In addition, we wished to isolate the essential features governing the modeling results. To do so, we derived a simplified model that shares the essential characteristics of the full model defined in Equation 1. The first change was to replace the time-dependent parameter *G*_{De}(*t*) by a constant, so that the simplified model became an Ornstein–Uhlenbeck (OU) process. Furthermore, we replaced the optic flow generated input by spatially correlated noise. However, cells retained the sinusoidal tuning curves of the full model. Finally, the correlations between the inputs to different cells decayed with the distance between them to capture the effect of overlapping RFs. In this simplified model, described by Equation 12, we again changed the magnitude of the diffusive coupling between the cells to examine its impact on encoding.

In form, the corresponding Langevin equations for the evolution of the coupled OU processes **X**_{Ax}, **X**_{De} are similar to those of the full model (Eq. 1) as follows:
The matrix *A*_{Ax} has entries given by
Thus, **X**_{Ax}, **X**_{De} are 10-dimensional processes, with a copy of the system for each hemisphere. The two copies are uncoupled, and differ parametrically only in their RF centers, as in the full model. We labeled the parameters so as to facilitate comparisons with their counterparts in the full model. For example, the parameter *a*_{gap} captures the coupling between neighboring compartments in the OU model, and corresponds to the parameter *g*_{gap} in the full model.

The tuning curves, **μ**(θ) in Equation 12, describe the steady-state response in the absence of fluctuations. To approximate the steady-state responses of the full system, we defined individual tuning curves as sinusoids, μ* _{i}*(θ) = sin(θ − ϕ

*), where ϕ*

_{i}*indicates the RF center of the*

_{i}*i′th*cell. Therefore, a cell will respond most strongly to stimulation angles orthogonal to the cell's RF center, as in the detailed VS model (Fig. 2

*E*). We again obtained estimators from the time integrals of the cell's responses. In Equation 12, the last term in the equation for

**X**

_{De}modeled the correlated input to the dendrites. The matrix

*B*was chosen so that correlations to neighboring dendritic compartments decayed exponentially with a space constant of 10°: if Δφ was the shorter angular distance between the RFs of two neurons, the correlations between the inputs to the two was exp( −Δφ/10°).

_{De}In Figure 12, we used the following parameters: τ_{Ax,OU} = 0.2, τ_{De,OU} = 40, *a*_{Ax-Den} = 1, σ_{Ax,OU} = σ_{De,OU} = 0.2. RF centers, ϕ* _{i}*, agreed with the full model. The window of integration for responses was

*T*= 10 units of dimensionless time. The coupling strength

*a*

_{gap}varied, and values used are indicated in the legend of Figure 12.

##### Generation of figures.

In all figures, to generate a single sample from the true distribution of the VS model axonal response we first used Equation 1 to model the response of the VS model to random optic flow stimuli. We then computed for each individual simulation the corresponding temporal average (Eqs. 3, 4) to obtain a single sample from the response distribution.

To compute the approximate minimum mean-square estimate used to generate Figure 10 and similar figures, we first had to approximate the joint distribution of **V̄**_{Ax}^{tr}, the transient averaged VS axonal responses. We did so in two steps: we first estimated the marginal distribution for each V̄_{Ax,i}^{tr} by generating samples as described above and binning at a sufficiently fine resolution. For checkerboard and bar images, we generally binned over the interval [−8 mV, 8 mV] at a resolution of 0.038 mV (420 bins). For natural scenes, where responses were weaker, we generally binned over the interval [−1 mV, 1 mV] at a resolution of 0.0048 mV (420 bins). We verified that bin sizes were small enough so that the results did not change with a further decrease in bin size (results not shown). We determined the marginal histograms at a 1° resolution, for each different time window considered. The histograms were approximated using 10,000 data points at each rotation angle. Given this large sample of realizations of the random vector **V̄*** _{i}*, the values

*F*(V̄

_{i}*) were approximated by applying a rank transformation (Berg, 2009).*

_{i}In the second step, we fit the Gaussian copula, also by generating samples directly from the model. The correlation matrix, which parameterizes the Gaussian copula, was determined using Equation 7. As with the histograms, we fit the copula at each angle at a 1° resolution and for each different time window considered. We fit the copula to 1000 samples generated independently from those used to fit the histograms.

With these approximations of the true histograms and maximum likelihood estimates of the parameters for the fit of the Gaussian copula, we drew 1600 new samples from the true model, at a spatial resolution of 5° for θ_{stim}. For each sample, we computed the approximate MMSE. By averaging across these samples at each value of θ_{stim}, we obtained estimates of the relation between θ_{stim} and the mean-square error of the MMSE, as plotted on the left of each panel in Figure 10 and similar figures. The averaged mean-square error shown in the bar plots on the right of each panel in Figure 10 was obtained by averaging these data over all values of θ_{stim}. The averaged mean-square error calculated for each subset in Figure 9 was determined likewise.

Given the intensive nature of the simulations involved in obtaining even a single data point in these figures, all computations had to be performed on supercomputer clusters. For instance, the sum of all computations performed to generate a single curve in the left panel of Figure 10*A* amounted to >2000 CPU hours on 2.2 GHz AMD Barcelona processors. The computer code necessary to implement the VS cell system is deposited in the ModelDB repository (accessible at http://senselab.med.yale.edu/modeldb).

## Results

Neurons of the VS in the fly respond to visual input, and encode information about the horizontal axis of ego-rotation (Fig. 1). This information is used by cells downstream from the VS system to control flight. We studied how the azimuth of the angle of body rotation is encoded in the response of the VS neuronal network using the model schematically depicted in Figure 2.

Presently, it is not known how downstream neurons read out information from the VS response. However, physiological evidence suggests that only a few VS cells form connections with specific downstream premotor neurons (Haag et al., 2007; Wertz et al., 2008, 2009a,b). We therefore explored the encoding of the axis of ego-rotation in the response of a subset of VS cells, by asking what is the best possible estimate of the rotation angle that can be obtained from a partial readout of the VS response? To answer this question, we computed the best estimate (the minimum mean-square estimate or “ideal estimate” from here on) that can be obtained from partial, subpopulation readouts. We found that electrical coupling substantially decreases the error of this estimate. We explain this observation by examining the coupling-induced changes in tuning curves, variability, and correlations of the VS cell population. Surprisingly, we found that coupling has no effect on estimates obtained from a full VS population readout, an observation we go on to explain mathematically. Finally, we validate the generality of these findings using a simplified and abstracted model of the VS network.

### Encoding of the azimuthal rotation angle in the VS response

We examined the response of the VS system to a variety of visual stimuli. To simulate the environment surrounding a fly, we first projected images onto the surface of a sphere centered at the fly. The sphere was rotated clockwise about a horizontal axis with azimuthal angle θ_{stim}, thus generating a pattern of optic flow (Fig. 2*A*). In each trial, we recorded the response of the model VS network to the optic flow generated from a single, random image. Across trials, we used different, randomly generated images: random bars, random checkerboards, and various compositions of natural scenes (see Materials and Methods). We show that, although there were quantitive differences in the VS responses, our results and conclusions hold for all image classes.

Based on our current knowledge of the fly visual system, the optic flow stimulus was filtered by an array of local motion (Reichardt) detectors (Reichardt, 1987; Borst et al., 2003; Fig. 2B). These detectors functionally separated the stimulus into upward and downward motion components which were summed according to the retinotopic RF of each neuron (Fig. 2*C*). These signals formed the input to the dendrites of the VS neurons and downward (respectively upward) motion elicited a transient increase in the excitatory (respectively inhibitory) conductance. The magnitude of this conductance change depended nonlinearly on the local rotational velocity. Figure 2*D* provides a schematic of the model.

In the fly, the VS neurons are arranged in a one-dimensional array with adjacent, ipsilateral cells coupled via axo-axonal gap junctions (Fig. 2*D*, far right). The steady-state response of the model network to downward motion in a narrow vertical strip is shown in Figure 2*E*. Input to one VS cell can impact the axonal response of other cells in the network because of coupling. Thus the “effective” axonal RFs of VS cells are much wider than the dendritic RFs displayed in Figure 2*C*. Hence, coupling allows the VS neurons to pool the responses of their neighbors (Cuntz et al., 2007; Elyada et al., 2009). Our goal was to understand the effect of such coupling on the encoding of the azimuth of the axis of ego-rotation in the VS population response.

#### Transient encoding of the axis of rotation in a VS subpopulation

During cruising flight in a stationary environment, flies often move along straight-line segments separated by saccadic periods of rapid rotation. These straight-flight segments occur at rates up to 10 per second and may be as short as 30 ms in duration (Schilstra and van Hateren, 1999; van Hateren and Schilstra, 1999; Boeddeker and Egelhaaf, 2005). Because motor projections of the VS network must pass through intermediate descending neurons (Strausfeld and Bassemir, 1985; Haag et al., 2007; Wertz et al., 2008, 2009a,b), the representation of ego-rotations for compensatory optomotor responses must take place at an even shorter timescale. Similarly, short timescales are likely critical during other natural flying behaviors, such as pursuit and tracking (Land and Collett, 1974; Collett and Land, 1975).

To understand the role of coupling in the VS network we examine its impact on the information available to downstream neurons that are involved in flight control. In particular, we focus on a pair of prominent premotor descending neurons that has been identified within each brain hemisphere. These descending neurons of the ocellar and vertical system (DNOVS) form gap junctions with subsets of the VS cells, and directly innervate motor neurons in the thoracic ganglion of the fly (Haag et al., 2007; Wertz et al., 2008, 2009a,b). DNOVS1 and DNOVS2 couple electrically to ipsilateral VS neurons, with the strongest coupling to the VS6–7 and VS5–6 neurons, respectively. The response of these downstream neurons is determined by temporal filtering of the graded response of the VS population rather than instantaneous values of their membrane potentials. We therefore considered time-averaged integrals of the transient response beginning from rest. We denote the vector of these averaged axonal responses by **V̄**_{Ax}^{tr}(*T*) (Eq. 3; Fig. 3), where *T* denotes the length of the integration window and the “tr” superscript indicates the transient state of the system.

The VS response encodes information about the ego-rotation axis parameterized by its azimuthal angle, θ_{stim}. We asked, “What is the best possible estimate of the rotation axis obtainable from the response of neurons that provide input to the DNOVS cells?” We therefore computed the estimator that minimizes mean-square error (the ideal estimate) based on the response of the VS5–7 neurons from each hemisphere, and report the corresponding azimuthal angle, θ̂_{stim}^{MMSE}, as described by Equation 11. Computation of θ̂_{stim}^{MMSE} relies on an average taken over the posterior distribution of the stimulus, given the subpopulation VS response, *P*(θ_{stim}|**V̄**_{Ax}^{subpop,tr}(*T*)). The vector **V̄**_{Ax}^{subpop,tr}(*T*) has six components corresponding to three cells, VS5–7, in two hemispheres. Later we consider readouts from larger subpopulations, and hence response vectors with more components.

Our model VS network is complex, and it is not obvious how to parametrize the likelihood *P*(θ_{stim}|**V̄**_{Ax}^{subpop,tr}(*T*)). As a result, Equation 11 cannot be evaluated via Markov Chain Monte Carlo methods, or other techniques designed for efficient sampling from probability distributions (Robert and Casella, 2004). We therefore simulated the model directly to estimate the likelihood, and then computed the integral in Equation 11 after using copulas to approximate the multivariate distribution of the responses (see Materials and Methods for details; Fig. 4 illustrates the quality of the copula approximation).

The ideal estimate gives the best possible estimate of the axis of rotation from the responses of a VS subpopulation, **V̄**_{Ax}^{subpop,tr}(*T*). Therefore, the mean squared error of any other estimate computed by a downstream population with access to the same VS subpopulation must equal or exceed that of the ideal estimate. Although we do not know how the output of the VS system is decoded, our results provide limits on the performance of any postsynaptic decoder.

#### Dynamical effects of coupling on VS network responses

Even in the absence of internal variability, the axis of rotation cannot be decoded perfectly from the VS response. Each visual scene consists of a different arrangement of edges and other features. Thus different scenes rotating about the same axis, result in different VS responses. If the parameter of interest, θ_{stim}, is fixed across a set of trials, we refer to the differences in the VS responses as overall trial-to-trial fluctuations. Such fluctuations can be due to variability in visual scenes (external variability), or noise generated internally, but the origin of overall trial-to-trial fluctuations is irrelevant when estimating the axis of rotation from a VS readout.

We illustrate this point in Figure 5 which shows the responses of the VS system to different random bar images. Each image contained the same number of bars of equal shape. However, their arrangement differed from image to image. Even in the absence of intrinsic fluctuations (i.e., σ_{Ax} = σ_{De} = 0), this resulted in different VS responses when the images were rotated about the same axis (Fig. 5*A*).

Electrical coupling between VS cells can significantly reduce such overall trial-to-trial fluctuations. Figure 5*A* shows the effect of coupling using five typical responses of the left-side VS10 neuron to rotations of random bar images about θ_{stim} = 90° with (left) and without (right) axo-axonal gap junction coupling.

We define the overall trial-to-trial covariability in the VS cell responses as the correlation coefficient computed over trials with different visual scenes, but fixed θ_{stim}. Figure 5*B* shows the sample (Pearson) correlation coefficient between pairs of VS membrane potentials averaged over the window labeled SS in Figure 5*A*. As expected, correlations increase with coupling. However, even in the absence of coupling, the overlap in the RFs of the different VS cells results in strong correlations between neighboring cells.

Figure 5*C*,*D* shows how steady-state tuning curves (mean responses as a function of θ_{stim} and response variability are affected by coupling. In addition to reducing variability, coupling allows the VS neurons to interpolate their responses (Cuntz et al., 2007; Elyada et al., 2009), and increase their orientation coverage (Graf et al., 2011): when coupled to its neighbors, each cell exhibits a graded response to an increased range of stimulus angles. This effect can also be observed by comparing the dendritic RFs in Figure 2*C* with the much broader axonal responses in Figure 2*E*. Notably, such smoothing takes place without a significant decrease in tuning curve amplitude.

### Effect of coupling on VS5–7 subpopulation encoding

What is the impact of coupling on the quality with which the VS5–7 neurons encodes the axis of rotation? The effects described in the previous section point to a potential trade-off: coupling could improve encoding by extending the range of tuning curves, and reducing response variability. On the other hand, increased correlations can decrease the fidelity of a parameter estimate. Whether and to what extent this is the case depends on the specifics of the system (Barlow, 1961; Panzeri et al., 1999; Averbeck et al., 2006; Shamir and Sompolinsky, 2006; Ecker et al., 2011; Latham and Roudi, 2011).

We examined the impact of coupling on estimating the rotation axis by computing its ideal estimate from a partial readout of the transient VS response to different images and a variable integration time. As shown in Figure 6, coupling greatly increases the accuracy of the estimate obtained from the responses of VS neurons 5–7. Even moderate levels of coupling resulted in a strongly reduced mean squared error of the estimate θ̂_{stim}^{MMSE} regardless of integration window and type of image presented (Fig. 6, bar charts). We will show that the situation is very different for a full population readout, where the mean-square error of the ideal estimate is largely independent of coupling.

### Factors determining partial decoding improvement

How does coupling improve decoding? To address this question, we next investigated whether improved estimation of the axis of rotation from the VS5–7 subpopulation is due to changes in individual cell responses. For instance, can changes to tuning curves and the reduction in variability observed in Figure 5*C*,*D* explain the improvements in encoding accuracy in Figure 6? Alternatively, can the improvements be explained by changes in the joint response of the VS neurons, such as increases in correlated variability (Fig. 5*B*)?

To obtain a full picture of the factors governing decoding performance, we also investigated how population-level encoding of the axis of rotation depends on the dimensionality of the response: how does a readout from multiple cells receiving distinct dendritic input compare with a readout from a single cell linearly integrating the same dendritic responses? Finally, we asked how the readout from different subpopulations is impacted by coupling.

#### Overall trial-to-trial covariability versus tuning curve smoothing

To examine the impact of overall trial-to-trial covariability on the error of the optimal subpopulation decoder, we shuffled the responses of each VS neuron across trials, separately for the uncoupled and coupled systems. Specifically, the decorrelated (shuffled) responses were obtained by drawing independently, for each fixed θ_{stim}, from the set of responses used in Figure 6. This allowed us to maintain the features of the single neuron response to the stimulus (the tuning curve shape), while removing overall trial-to-trial covariability. In the coupled case, overall trial-to-trial covariability includes both stimulus-induced and coupling-induced correlations, whereas in the uncoupled case it consists only in stimulus-induced correlations.

Removing overall trial-to-trial covariability produced a surrogate VS response distribution that differed from the true one. Using this “incorrect” distribution therefore also resulted in a different posterior distribution over the angle θ_{stim}, resulting in increased ideal estimate error. The magnitude of this increase tells us how important overall trial-to-trial correlations are for the ideal estimate.

As noted previously, coupling also smooths the tuning curves and increases coverage. We can compare the ideal estimate obtained in the absence of coupling without shuffling to the ideal estimate obtained with coupling and shuffling. Because overall trial-to-trial covariability is ignored in the second case, any reduction in error is due to changes in the responses of individual cells.

We present the mean-square error of the ideal estimate before and after shuffling in Figure 7. The left plot shows the error as a function of the axis of rotation for two coupling strengths, for trial-shuffled and the original data. The bar chart to the right presents the estimation error for the uncoupled and coupled systems, with and without trial-shuffling.

The response of the VS system is more strongly correlated in the presence of coupling. Therefore shuffling has a stronger effect on the response distribution, and performance of the ideal estimate suffers more when VS cells are coupled. In Figure 7 the difference between the dashed blue bar and the solid orange bar characterizes the impact of tuning curve smoothing due to coupling. The difference in performance between the solid and dashed orange bars characterizes the impact of overall trial-to-trial correlations. Thus, we see that approximately half of the improvement in mean-square error for the subpopulation readout was accounted for by the changes to tuning and the other half by correlated variability: *changes in the marginal statistics of the response and changes in the correlation structure were of nearly equal importance*.

#### Role of axonal filtering

The VS response is part of a hierarchy of signal processing steps, as shown in Figure 2*D*. In particular, the axonal system receives input from the dendritic system, which itself has a retinotopic RF structure, and also encodes the axis of rotation in its response. We have shown that coupling allows for a more accurate estimate from a subpopulation of the VS cells. However, if only a subset of the VS responses is used by an estimator, why are there 10 cells instead of a single cell with a spatially extended dendrite?

To answer this question, we simulated the system with only a single cell (axon) in each hemisphere which received input from all 10 dendrites, and left the dendritic structure unchanged. In this case, the axonal response within each hemisphere is univariate and evolves according to the following (compare with Eq. 1): This is equivalent to assuming that a single axonal compartment couples to a single dendritic compartment evolving as the average of the 10 dendritic compartments in the full model.

Figure 8 shows that replacing 10 cells by one has a strong negative impact on stimulus encoding. Here we compare the error of the ideal estimate for the single axon system to that of the full, coupled system with a partial readout consisting of the VS5–7 cells (Fig. 6). The average error of the ideal estimate in the full system is smaller by over a factor of four (Fig. 8, right, compare the solid orange and the dashed blue bars), and the error is reduced by up to a factor of eight at specific stimulation angles (Fig. 8, left).

Thus, multiple cells with differing tuning curves improve the performance of the ideal estimate, even when coupling increases overall trial-to-trial correlations between cells. The improvement is partly due to the additional dimensions available when reading out the VS5–7 responses, compared with a readout from a single compartment. However, this is not all: we compared the performance of the ideal estimate for the single axon system to that of the full, coupled system, with a readout from only the VS5 cell. The coupled system is again superior to the single axon system (Fig. 8, compare the dashed orange and the dashed blue lines). Hence averaging the responses of all dendritic inputs notably degrades ideal estimate performance. Although electrical coupling introduces correlations, responses of the individual VS cells are not identical. A balance between a coupling that allows for the integration of information in a subpopulation, and a distinct response between different VS cells seems best for stimulus encoding in a subpopulation. These results hold for passive dendrites; it is an open question whether a single axon system with active dendrites could perform better.

#### Partial readout subset size and VS cell identity

We asked to what extent our results depend on the particular subset of the VS population used to compute the ideal estimate. Would it matter if the DNOVS cells received input from a different subset of VS cells? To answer this question we randomly selected 20 distinct readouts of sizes up to five (except for readouts of size one, of which there are only 10 possible choices). For readouts from three cells, we ensured that the set VS5–7 (the readout considered in Fig. 6) was included. For each of the chosen readouts, we computed the mean-square error averaged across all values of θ_{stim}.

The results are presented in Figure 9. The mean-square error for the coupled network (Fig. 9*B*) decreases rapidly with the size of the readout subpopulation. Improvements are marginal beyond two cells. Strikingly, beyond one-cell readouts, there is very little dependence on the particular subset of VS cells on which the estimate is based. In contrast, the average mean-square error across readouts does not decrease as rapidly for the uncoupled network. In this case, the mean-square error also depends strongly on the identity of cells in the particular subset. Although the average mean-square error for the uncoupled and coupled network are close for readouts from three or more cells, the error for the “worst” readout is far larger in the uncoupled case.

Despite similarities in membrane and RF properties across the VS cells, we observed a large variance in the error depending on cell identity when reading out only a single neuron VS response, for both the uncoupled and coupled network. This variability can be attributed to the unequal coverage of stimulus space by the VS RFs, combined with the particular structure of interneuronal coupling. The large variability in the error for readouts from multiple uncoupled cells is due partly to the narrow tuning curves. Uncoupled neurons that are physically close do not communicated with other VS cells. Therefore, the collective “orientation coverage” of uncoupled neighboring cells is lower than for nonadjacent neurons. For such groups of neurons, a larger range of rotation axes will be poorly encoded. In the coupled system, the story is different: when considering a readout from at least two VS neurons from each hemisphere, the identity of the cells is not important. Coupling among VS neurons brings about a degree of “information democracy” in which any pair of neurons carries roughly equal information about the axis of rotation.

Overall, coupling was uniformly beneficial to the fidelity of encoding of the axis of rotation in the axonal response. Strikingly, however, the error for the readout of the VS5–7 cells was highest among all tested readouts for both the coupled and uncoupled system. Two key factors contributed to this result: first, single-cell readouts of each of these neurons yielded a larger error than single-cell readouts from other VS neurons (results not shown). In addition, these three cells are physically close, and thus lead to larger estimation errors. This phenomenon is closely linked to observations of changes in orientation coverage with population size recently reported in pools of orientation selective cortical neurons used to discriminate sinusoidal gratings drifting in different directions (Graf et al., 2011). Thus, if biological constraints dictate that the rotation angle be estimated primarily from the response of the VS5–7 neurons, coupling of the VS axonal responses becomes of great importance (Fig. 6).

Comparing the trends of the mean-square error in Figure 9 in the absence (Fig. 9*A*) and presence (Fig. 9*B*) of coupling suggests that coupling has a diminishing effect as subpopulation size increases. We next examine this observation further.

### Estimation of the axis of rotation from the full population response

We have shown that coupling significantly improves estimates from a partial readout of the VS response. How does coupling affect a readout from the entire VS population?

The performance of the ideal estimate for the angle of rotation estimated from the integrated axonal potentials of the full VS population is shown in Figure 10. The error of the ideal estimate depended on image statistics, but surprisingly, it was approximately independent of coupling strength.

As shown in Figure 5, coupling has a strong effect on correlations and tuning curves, and decreases the error of the ideal estimate obtained from a partial readout. It is therefore surprising that the same changes in tuning curves and correlations have no effect on the error of the ideal estimate obtained from the full population response. To understand this difference, we first apply the OLE to steady-state VS responses (see Materials and Methods). Although less general than the ideal estimate, the OLE is easier to analyze. We show that the insights obtained from the OLE in steady-state carry over to the ideal estimate both in steady-state and transient states.

#### The mean-square error of the OLE is independent of coupling in steady-state

We defined the steady-state response of the system of VS cells using averages of the graded responses of the VS population, **V̄**_{Ax}^{ss}(*T*) (Eq. 4). Here *T* indicates the time window of integration and the superscript “ss” indicates that the system is in steady-state. We then computed the OLE based on **V̄**_{Ax}^{ss}(*T*).

The mean-square error of the OLE as a function of θ_{stim} is shown in Figure 11, for three types of images and two integration time windows: across image types and integration time windows, the mean-square error of the OLE in steady-state was also independent of the strength of axo-axonal coupling. The statistics of the optic flow generated by images from a given class set the baseline level for the mean-square error. However, the error was independent of coupling strength for all classes of images tested. We also verified this for different parameters of the random bar images (results not shown; see Materials and Methods for details of image generation).

#### Explanation of coupling-independence

We first note that since the axonal and dendritic compartments were coupled electrically, the dendritic response was affected by the strength of the axo-axonal coupling. However, the impact of coupling on the vector **V**_{De}(*t*) of dendritic responses was limited to a multiplication of the response by a diagonal matrix (result not shown). Except for this scaling, the time course of the dendritic response is dominated by the synaptic input arriving through the visual pathway, which reflects the response to the filtered optic flow stimulus. Therefore, to a good approximation, the system may be viewed as hierarchical (Elyada et al., 2009): the motion detector-filtered optic flow stimulus drives the dendrites, and the dendrites drive the axonal compartments, with the activity at each step determined completely by the response at the preceding stage (along with any intrinsic noise sources).

As we show next, a consequence of this observation is that the vector of average axonal responses **V̄**_{Ax}^{ss}(*T*), is linearly related to the vector of average dendritic responses **V̄**_{De}^{ss}(*T*). The specific relation between the two vectors changes with coupling. However, because the relation is linear, the axonal and the dendritic response will give the same estimate of the axis of rotation, regardless of coupling. We next make these intuitive observations more precise.

Disregarding intrinsic noise in the system, we have
Here, _{D} indicates equality in distribution, and *H*(*s*) = *C*_{m}^{−1}*g*_{Ax-Den}exp[−*C*_{m}^{−1}*G*_{Ax}*s*] Θ(*s*) is the exponential filter the axonal system applies to the dendritic response. We note that at positive times, when the Heaviside function Θ(*s*) = 1, *H*(*s*) is a matrix exponential. That the axonal response can be represented by a convolution of a matrix exponential with the dendritic response (second equality), is a general mathematical property of linear systems of differential equations such as those which describe the evolution of **V**_{Ax} (Eq. 1).

The equality in distribution in Equation 14 arises from switching the order of integration and using the time-shift invariance of the dendritic membrane potential distribution under the steady-state assumption. In this case the distributions of **V**_{De}(*s* − *u*) and **V**_{De}(*s*) agree for all finite *u*, and the two quantities can be exchanged under an equality in distribution. The dendritic average **V̄**_{De}^{ss}(*T*) is defined analogously to the axonal quantity in Equation 4.

Equation 14 shows that, under the hierarchical assumption, the axonal activity is conditionally independent of the input given the dendritic activity: if the linear relation between axonal and dendritic responses is invertible, the posterior distribution of the stimulus given the axonal response agrees exactly with the distribution conditioned on the dendritic response. In other words, because **V̄**_{Ax}^{ss}(*T*) ≡_{D}**ℋV̄**_{De}^{ss}(*T*) for some invertible matrix ** H**, it follows that:
In this situation, there is no change in information about the rotation angle due to coupling. This equality holds as long as

**is invertible. Realistic gap junction coupling strengths change the entries in the matrix**

*H***, but do not impact its invertibility.**

*H*We found that the dendritic responses were independent of coupling up to an invertible linear scaling factor, implying that the posterior distribution *P*(θ_{stim}|**V̄**_{De}^{ss}(*T*)) is likewise independent of coupling. Thus, not only will the performance of the OLE be unaffected by coupling in this case, but the same conclusion holds for more general probabilistic estimators (including the ideal estimate we consider next, all Bayesian estimators and the maximum likelihood estimator).

#### Independence of the ideal estimate on coupling in the transient state

The explanation of why the ideal estimate error is independent of coupling in the transient state is largely identical to that provided in the steady-state case (Eq. 14), with one crucial difference: the equality in distribution on the third line of Equation 14 does not hold exactly for transient responses. A priori it is not obvious that the equality should hold even approximately. However, if the axonal responses are fast then the equality will hold to a good approximation: intuitively, a fast response means that the axonal filter, *H*(*u*) is not negligible only for *u* close to 0. Because the dendritic voltage changes are relatively slow, **V**_{De}(*s* − *u*) ≈ **V**_{De}(*s*) is nearly independent of *u*.

More precisely, the characteristic response timescales of the VS axonal compartments (and of the system filter *H*) are given as the product of the membrane capacitances with the inverses of the eigenvalues of *G*_{Ax.} If these eigenvalues are large enough, the equality in distribution in Equation 14 does hold nearly exactly because of a separation of the timescale of the axonal response from that of the output integration window. In this case, there again exists an invertible linear relationship between the transient axonal average **V̄**_{Ax}^{tr}(*T*) and the corresponding dendritic average **V̄**_{De}^{tr}(*T*).

The VS axonal compartments have baseline time constants on the order of a millisecond (Borst and Weber, 2011). The effective time constants will thus be even smaller, because of the axo-dendritic and axo-axonal gap junction synapses (Rudolph and Destexhe, 2003). These short integration time constants allow the system to reliably implement a linear transfer from the dendritic to the axonal averages for transient responses, as in Equation 14. To verify this, we performed linear regression analysis of the dependence between the axonal and dendritic responses. We found *R*^{2} values for individual coordinates were >0.999 for all coupling values (*g*_{gap} = 0, 0.5, 1 μS) and integration time windows tested (*T* = 10, 20 ms) when the intensity of intrinsic noise was set to zero (results not shown). This indicates that transfer is nearly linear in each axonal dimension. The linear regression was performed for random values of θ_{stim}, indicating the independence of this transfer matrix from the stimulus value.

In short, the axonal network uses a very fast system filter to institute a highly reliable linear transfer of the averaged transient dendritic response to the average transient axonal response. As in the steady-state case, the entries of the transfer matrix depend on coupling, but its existence and invertibility do not. Hence, the posterior distribution of the stimulus conditioned on the axonal response is nearly identical to that conditioned on the dendritic response (Eq. 15), and the latter is approximately independent of coupling (Elyada et al., 2009). As a result, the estimates (and error) of any probabilistic estimators will not depend significantly on the strength of the axo-axonal gap junction coupling within the VS system.

#### Coupling improves partial decoding in a simplified model

We next asked whether our conclusions depended on the details of the model fly visual system we used in our simulations, the statistics of the images presented, or other choices we made in our analysis. For this purpose, we examined whether similar conclusions can be obtained in a simplified, analytically tractable approximation of the full VS model (see Materials and Methods). In this simple OU system we discarded the Reichardt detectors, and the complex visual input used in the full model. Instead, the input to the system was white noise, which was correlated in space to model VS cells' RF overlap.

The 10 dendritic and 10 axonal compartments were modeled by linear differential equations which shared two essential characteristics with the full model. First, we assigned the dendritic compartments tuning curves (i.e., mean responses in the absence of fluctuations) which were sinusoidal functions of the stimulus angle to emulate the retinotopic response properties of the VS neurons. Second, the axonal compartments were diffusively coupled.

In this model, we emulated the relatively slow timescale of the dendritic input arriving to the axon terminals, by assigning the dendritic compartments a time constant which was an order of magnitude larger than that of the axonal compartments. Details of the model are given in Materials and Methods.

In this setting, we qualitatively replicated our earlier findings: the performance of the ideal estimate obtained from a partial readout of the axonal responses increased with coupling. With a readout from the entire population, performance was independent of coupling (Fig. 12). We did not tune the model, and the result held over a wide range for all parameters (results not shown).

## Discussion

Organisms need to rapidly extract information about ego-motion from complex patterns of optic flow (Fig. 1*C*). We used a simplified, but biophysically realistic model of the fly VS to examine the role of electrical coupling in encoding the azimuthal axis of ego-rotation. We have shown that this parameter can be quickly and accurately extracted from the transient response of a VS subnetwork.

The impact of coupling and correlations on encoding in neuronal populations has been studied extensively (for review, see Averbeck et al., 2006). Interestingly, we found that coupling did not affect the error of optimal estimates from complete population responses. The posterior distribution over the azimuthal angle of rotation was unaffected by coupling. Hence, Bayesian or maximum-likelihood estimators were similarly unaffected.

Physiological evidence suggests that part of the VS network drives the response of two of its postsynaptic neurons. The DNOVS 1 and 2 are efferent targets of VS cells projecting to thoracic ganglia where they contact neck motor neurons involved in head stabilization during flight (Haag et al., 2007; Wertz et al., 2008, 2009a). Each DNOVS neuron is most strongly coupled to two VS cells, and the DNOVS pair is coupled predominantly to a subset of three VS neurons in each hemisphere.

Coupling between VS cells significantly improved encoding accuracy of the rotation angle in the response of this VS subpopulation. For this partial readout, the transfer from dendritic to axonal membrane potential is approximately linear, but not invertible. Hence, the performance of the estimator depends on coupling. The dynamical changes induced by coupling resulted in a subpopulation readout that could be as accurate as a full population readout. Coupling had the greatest impact on the readout from VS5–7 and postdendritic processing enabled by the coupling of distinct axonal compartments was crucial to encoding accuracy. These results are robust and general: model details, such as image features and integration timescales, had only a quantitative impact.

Based on these results, we can make concrete predictions that can be tested experimentally. These predictions rely on the ability to ablate a single VS cell that does not provide direct input to DNOVS neurons, or block its gap junctions. Under such conditions, the decoding of rotation axes from postsynaptic DNOVS neurons should be differentially affected, depending on which VS cells the rotations preferentially stimulate. If such specific silencing of a VS neuron were possible in the intact animal, we predict a similar differential effect on behaviors controlled by DNOVS neurons.

The fly visual system is an established model for the study of optimal motion encoding and decoding (Laughlin, 1981; Bialek et al., 1991; van Hateren, 1992; Gabbiani and Koch, 1996; de Ruyter van Steveninck and Laughlin, 1996; Fairhall et al., 2001). To the best of our knowledge, our work is the first to examine the impact of electrical coupling on optimal population coding by using a mathematical and computational analysis of the response of a detailed model of the VS network (Borst and Weber, 2011). Previous arguments about the benefit of coupling were primarily heuristic (Elyada et al., 2009), and generally concerned steady-state responses (Cuntz et al., 2007; Weber et al., 2008). By considering dominant eigenmodes, Weber et al. (2008) showed that coupling leads to a reliable, lower dimensional representation of VS activity. Rotation about a given azimuthal axis results in depolarization of the VS cells located to one side of the axis, and hyperpolarization in the cells located on the opposite side. This prompted Cuntz et al. (2007) and Elyada et al. (2009) to propose estimating the axis of rotation by interpolating the responses of the VS cells, and finding a zero-crossing in the mapping between these responses and the VS cells' preferred axes of rotation. Coupling reduces response variability, and hence the error of this estimate.

When we implemented such a zero-crossing estimator (see Materials and Methods) and compared it to the ideal estimate, we found it to be suboptimal. The estimator usually struggled to achieve reasonable encoding accuracy when presented with natural scenes. Responses to natural scenes were weaker than those induced by the other image types we considered, and the zero-crossing estimator is very susceptible to noise. Furthermore, coupling could either improve or degrade the estimates, depending on image statistics. The zero crossing estimator from a subpopulation response performed even worse, producing errors several times that of the ideal estimate.

Furthermore, it is unclear how the zero-crossing and other proposed suboptimal estimators could be implemented downstream from the VS cells. In contrast, the ideal estimate establishes a baseline for the performance of any estimator. Although suboptimal estimators may be affected by coupling in different ways, we found consistent results for the ideal estimate across stimulus conditions. Although neuronal networks may not process information optimally (Loeb, 2012), evidence for approximate optimality exists both at the behavioral and neural level (Ernst and Banks, 2002; Fetsch et al., 2012). We used optimal estimators as an operational benchmark, revealing the potential capabilities of the system under realistic assumptions.

Two studies are conceptually close to our work: Karmeier et al. (2005) took a Bayesian approach to quantify the encoding efficiency of the axis of rotation in the VS population response. They also proposed time integrals of the VS membrane potentials as readout variables, and examined the impact of population size on encoding in the VS population. We note several important differences: Karmeier et al. (2005) did not investigate the effect of VS coupling, instead focusing on the effects of integration time and input correlations. Furthermore, they used a phenomenological model, in contrast to our biophysically plausible model. More recently, Weber et al. (2012) applied generalized linear models to assess the benefits of coupling between two optic flow-processing, spiking neurons of the lobula plate (H1 and Vi) for conveying information about optic flow parameters.

We found that changes in correlation structure were important in improving encoding accuracy in the case of a subpopulation readout (Fig. 7). These findings contrast with typical arguments about the benefit and harm of trial-to-trial correlations (Tkačik et al., 2010). Correlations between VS neurons carry information about the responses of unobserved neurons. Electrical synapses are both strong and fast in their effect on subthreshold dynamics relative to their chemical counterparts (Xiao et al., 2013). They are therefore well suited for increasing the coverage of a parameter, reducing variability, and introducing correlations.

Many previous theoretical studies examined how changes in neuronal response statistics, such as correlations or tuning curves, impact coding (Barlow, 1961; Sompolinsky et al., 2001; Seriès et al., 2004; Averbeck et al., 2006; Josić et al., 2009; Ecker et al., 2011; Latham and Roudi, 2011). This can give valuable insights into how coding is affected by aspects of the neural population response. However, correlations and tuning curves are not intrinsic properties of a population response that can be changed arbitrarily (Shea-Brown et al., 2008; Beck et al., 2012). To examine how the statistics of neuronal activity affect coding, it is therefore important to consider realistic networks with realistic inputs.

A key advantage of our approach is that we made no a priori assumptions about the VS population response. In particular, we made no assumptions about how the joint activity of VS neurons encodes the axis of rotation. Rather, we considered the responses of a biophysically realistic model to various stimuli. The spatiotemporal structure of the input, and the properties of the VS network fully determined its responses, allowing us to characterize the best estimate of the stimulus available to the animal (Graf et al., 2011).

Our results open a number of avenues for future research: we used temporal averages as a readout of the VS population responses. However, downstream DNOVS neurons are not perfect integrators. DNOVS 2 is a spiking neuron, introducing a strong nonlinearity into the processing pathway. Further, the effect of interactions with other neurons of the lobula plate should be investigated (Borst and Weber, 2011). We also did not attempt to examine the impact of correlations beyond second order. Application of maximum-entropy approaches could help address this topic (Jaynes, 1957).

The aerial performance of flies is unmatched in nature and technology (Frye and Dickinson, 2001). Understanding how a small set of neurons in the fly removes irrelevant variability to extract behaviorally relevant information can provide insight into the implementations in more complex organisms.

Our results in the fly VS network are suggestive of a general principle: coupling between neurons allows for near-optimal readouts from a subpopulation. Correlations between the responses introduced by coupling can lead to information democracy, where even small subpopulations carry nearly as much information about the stimulus as the entire population. In a neuronal network that encodes a behaviorally relevant parameter, coupling can thus allow each neuron to represent a greater extent of the parameter space. When downstream targets extract information about this parameter from relatively few neuronal projections, correlations between responses can be highly beneficial (Stevenson et al., 2012). Although details of our study are particular to the fly visual system, the main ideas are likely to apply across organisms and modalities.

## Footnotes

This work was supported by NSF Grants DMS-0817649, DMS-1122094, a Texas ARP/ATP award to K.J., and a Humboldt Research Award, NSF Grant DMS 1120952, and NIH Grant MH065339 to F.G. We thank Drs Alexander Borst and Yishai Elyada for generously providing their computer code for the VS network model, Elisabeth Hopp for comments on the paper, and Edgar Gabriel (Department of Computer Science at the University of Houston) and the Texas Learning and Computation Center for access to computer clusters.

The authors declare no competing financial interests.

- Correspondence should be addressed to Dr Krešimir Josić, University of Houston, Department of Mathematics, 651 PGH Building, Houston, TX 77204. josic{at}math.uh.edu