Abstract
Motion estimation is crucial for aerial animals such as the fly, which perform fast and complex maneuvers while flying through a 3-D environment. Motion-sensitive neurons in the lobula plate, a part of the visual brain, of the fly have been studied extensively for their specialized role in motion encoding. However, the visual stimuli used in such studies are typically highly simplified, often move in restricted ways, and do not represent the complexities of optic flow generated during actual flight. Here, we use combined rotations about different axes to study how H1, a wide-field motion-sensitive neuron, encodes preferred yaw motion in the presence of stimuli not aligned with its preferred direction. Our approach is an extension of “white noise” methods, providing a framework that is readily adaptable to quantitative studies into the coding of mixed dynamic stimuli in other systems. We find that the presence of a roll or pitch (“distractor”) stimulus reduces information transmitted by H1 about yaw, with the amount of this reduction depending on the variance of the distractor. Spike generation is influenced by features of both yaw and the distractor, where the degree of influence is determined by their relative strengths. Certain distractor features may induce bidirectional responses, which are indicative of an imbalance between global excitation and inhibition resulting from complex optic flow. Further, the response is shaped by the dynamics of the combined stimulus. Our results provide intuition for plausible strategies involved in efficient coding of preferred motion from complex stimuli having multiple motion components.
Introduction
Flies often perform complex maneuvers while navigating a 3-D environment (Wagner, 1986; Fry et al., 2003), which results in a wide range of angular velocities. Behavioral studies (Collett and Land, 1975; Budick et al., 2007) have shown that such maneuvers also lead to complex optic flow with rotational components mixed in a way that is hard to disambiguate. Wide-field motion-sensitive neurons in the lobula plate region of the fly's brain have been known to respond selectively to motion along different directions (Hausen, 1981, 1982a,b; Hengstenberg et al., 1982). Anatomical studies have identified ∼60 different motion-sensitive neurons in the lobula plate (Hausen, 1981, 1982a; Hengstenberg et al., 1982; Borst and Haag, 2002), from which only a small subset of neurons, namely H1, V1, HS, and VS, have been extensively studied, owing to their ease of detection and robustness of response.
Much of the classic work on visual motion detection in insects and vertebrates employed simple 1-D spatiotemporal stimuli such as bar, grating, and sinusoidal patterns moving along a particular direction, usually along the preferred direction of the neuron (Hassenstein and Reichardt, 1956; Hausen, 1982b; van Santen and Sperling, 1984). Despite their practical advantages, these simplified stimuli fail to capture essential elements of a complex image flow generated during flight, such as rotations about the three principal body axes, and translation. These motion elements have been demonstrated in the flights of not only blowflies, but also of bees and wasps (Srinivasan et al., 1991; Hateren and Schilstra, 1999; Schilstra and van Hateren, 1999; Zeil et al., 2008). The receptive field structures of LPTCs further suggest sensitivity to movement along other directions besides the preferred direction (Krapp et al., 2001). It is thus conceivable that motion encoding is affected by the presence of additional rotational degrees of freedom. The geometry of spatial patterns (Borst et al., 1993; Saleem et al., 2012) and velocity components from other rotation types (Kern et al., 2005, 2006) also affect the response of motion-sensitive neurons. However, a quantitative study of preferred motion encoding from a complex multistimulus input in the context where a stimulus is not aligned to the preferred direction of the neuron has so far been lacking.
Here, we investigate the effect of additional wide-field components on the encoding of yaw by H1, a motion-sensitive neuron in the lobula plate of the fly. The stimulus consists of a 2-D pattern subjected to combined rotation about two axes. Combining techniques of information theory and dimensionality reduction, we quantify the reduction in H1's yaw encoding efficiency in the presence of a distractor and show that H1 also encodes certain distractor features, with the latter depending on the global excitation and inhibition from the optic flow. We further show that the neural response matches the collective dynamic range of the input, which indicates that response scaling is preserved for these multiple motion components.
Materials and Methods
Fly preparation.
Wild-caught blowflies, Calliphora vicina, were housed in an enclosed cabinet that was maintained at a temperature of 21°C and a relative humidity of 55%, and was illuminated with a 12 h light/dark cycle. Experiments were conducted primarily in adult female flies, and consistent results were obtained from six flies. We also checked for consistency of results against male blowflies. The fly was placed in a cylindrical tube, its wings and legs restrained using dental wax. Its head was immobilized by wax bridges from the genae (“cheeks”) to the rim of the tube, leaving the proboscis free so that the fly could be fed in between experiments. The spiracles were kept free to ensure proper respiratory intake during the experiment. After opening the rear of the head capsule, dorsolateral muscles and air sacs covering the lobula plate were removed. The fly holder was then transferred to a goniometer platform near the display screen. Room temperature was maintained between 19°C and 20°C and was monitored throughout the experiment.
Stimuli.
A spatial pattern was created by generating a 2-D random matrix on a square 0.1° × 0.1° grid, filtering these data with a 2-D spatial Gaussian of SD σ = 6°, and thresholding the resulting numbers around their mean to produce a set of binary intensity values. Subsequently, the image was linearly filtered to prevent spatial aliasing during stimulus presentation. The temporal stimuli consist of velocity waveforms corresponding to yaw, pitch, and roll, each of which is drawn from an independent Gaussian distribution. Because H1 is primarily sensitive to horizontal motion, roll and pitch motion will generically be referred to as “distractor” motion. A wide range of SD values for roll, σR = {100, 300, 1000, 3000, 5000, 7000, 10000}°/s, and pitch, σP = {10, 30, 100, 300, 500, 700, 1000}°/s, were chosen. The SD of yaw motion (σY) was kept unchanged at 100°/s for all yaw–distractor SD pairs. The relative strength of roll and pitch to yaw is defined as the ratio of their SDs (i.e., ξR = σR/σY and ξP = σP/σY, respectively). For each value of ξR and ξP, six different motion variants were presented sequentially in a trial (Fig. 1C), with each segment lasting 3 s. Segments 1, 2, and 6 consist of repeated yaw with nonrepeated distractor motion, while segments 3, 4, and 5 consist of yaw and distractor motions both of which were nonrepeated across trials. A total of 210 trials were used for each experiment.
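The repeated and nonrepeated velocity waveforms described above can be sketched as follows. This is a minimal illustration, not the stimulus code used in the experiments: the seeding scheme, the function names, and `base_seed` are hypothetical, and only the frame rate, segment duration, and σY are taken from the text.

```python
import random

FRAME_RATE_HZ = 500   # frame rate stated in the text
SEGMENT_S = 3         # each segment lasts 3 s
SIGMA_YAW = 100.0     # deg/s, fixed across conditions

def gaussian_trace(sigma, n_samples, seed):
    """Independent zero-mean Gaussian velocities (deg/s), one per frame."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_samples)]

def make_segment(xi, trial, repeat_yaw, base_seed=1234):
    """Return (yaw, distractor) velocity traces for one segment of one trial.

    xi         -- distractor-to-yaw SD ratio (xi_R or xi_P in the text)
    repeat_yaw -- True: same yaw trace on every trial; False: fresh trace
    base_seed  -- hypothetical; the seeding scheme is an assumption
    """
    n = FRAME_RATE_HZ * SEGMENT_S
    yaw_seed = base_seed if repeat_yaw else base_seed + 1 + trial
    distractor_seed = 1_000_000 + base_seed + trial  # always trial-specific
    yaw = gaussian_trace(SIGMA_YAW, n, yaw_seed)
    distractor = gaussian_trace(xi * SIGMA_YAW, n, distractor_seed)
    return yaw, distractor
```

Fixing the yaw seed while varying the distractor seed gives the "repeated yaw, nonrepeated distractor" condition; varying both gives the fully nonrepeated condition.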
The spatial map and the velocity waveforms for yaw and distractor degrees of freedom. A, Spatial map is created by smoothing a matrix of random values with a Gaussian of σ = 6°. The spatial extension is 360° along the x- and y-axes with a grid resolution of 0.1°. Contrast was set to maximum. B, Snapshot of an image seen in the TEK 608 monitor. Pixel intensities are interpolated for successive images displayed on the screen at 500 Hz frame rate. C, D, Yaw velocity traces (blue) and distractor (roll or pitch) velocity traces (red) within a 200 ms window for six segments, each lasting 3 s. The velocities of each of the example traces are drawn from Gaussian distributions with σ = {0, 100}°/s. “Rep” and “Non-rep” stand for velocity traces that are repeated and nonrepeated, respectively, across trials. A total of 200 trials are used for analysis. The solid dark and solid pale lines represent velocity traces in the same segment from two different trials.
We choose different ranges for ξR and ξP because roll motion, which is generated about a point of origin, introduces local velocity components that are different from those generated by pitch motion, which acts in the vertical direction. This serves two purposes. First, we obtain a reasonably wide range of distractor noise over which we can test the efficiency of H1. With the given size of the visual field (Fig. 1B), the tangential component induced by a 10,000°/s roll has a velocity of ∼3000°/s at the edge of the field. Tangential components at points closer to the center are smaller. Thus, the tangential velocity components of roll can effectively reach up to an order of 10 times the yaw velocity (σY = 100°/s). Since the maximum value of σP is 1000°/s, pitch velocities also can reach up to an order of 10 times the yaw velocity. Similarly, velocity components of roll and pitch at the lower limits of σR and σP are ∼10 times lower than that for yaw. Second, the velocities lie in the neighborhood of those generated during actual flight, though one must recognize the tradeoff between choosing a wide range of parameter values and the ecological relevance of the values. Behavioral studies suggest that the angular velocity of the head of the fly reaches up to ∼2500°/s for yaw, ∼1200°/s for roll, and <1000°/s for pitch during saccades. The velocities during nonsaccades are typically ≤100°/s (Wagner, 1986; Hateren and Schilstra, 1999; Kern et al., 2005). Therefore, the velocities induced by roll and pitch used in our experiments approximately overlap with the range of angular velocities accessed during saccadic flights. Since the retinal image speed is on the order of 100°/s for nonsaccadic flights, which account for ∼60% of the total flight time (Hateren and Schilstra, 1999), we choose a value of σY that is of the same order. This allows us to study yaw encoding in the presence of fluctuations ranging from nonsaccadic to saccadic limits.
Euler rotations in a 3-D coordinate system do not commute. In simple terms, this means that the collective rotation of two successive rotations about two different axes depends on the order in which the rotations are performed. However, in our stimuli, yaw, roll, and pitch rotations are small from one frame to the next, such that the rotations commute to a good approximation. Therefore, the order in which rotations are combined does not matter. Also, the rotations are displayed on a 2-D screen, such that in a flattened approximation, homogeneous movements along the horizontal and vertical axes of the screen are equivalent to yaw and pitch motion, respectively. Roll motion, resulting from rotation about the longitudinal axis, is always produced about the center of the screen.
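The near-commutation of small rotations can be checked numerically. The sketch below is illustrative only: the axis assignment (yaw about z, roll about x) and the per-frame angles in the usage note are assumptions chosen for the example, not values taken from the stimulus code.

```python
import math

def rot_z(deg):
    """Rotation about the z-axis (taken here, by assumption, as yaw)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(deg):
    """Rotation about the x-axis (taken here, by assumption, as roll)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutation_error(yaw_deg, roll_deg):
    """Largest absolute entry of Rz*Rx - Rx*Rz."""
    ab = matmul(rot_z(yaw_deg), rot_x(roll_deg))
    ba = matmul(rot_x(roll_deg), rot_z(yaw_deg))
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))
```

For example, `commutation_error(0.2, 2.0)` corresponds to one 2 ms frame of a 100°/s yaw combined with a 1000°/s roll. The error scales with the product of the two angles (in radians), so it grows for the largest distractor velocities.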
The stimulus was presented on a Tektronix 608 monitor at a 500 Hz frame rate and a radiance of 150 mW · sr⁻¹ · m⁻² (sr: steradian), which corresponds to an estimated 5 × 10⁴ transduced photons per second per photoreceptor. The spatial pattern was displayed on a hexagonal raster of 833 pixels with 38° horizontal and 44° vertical extents (Fig. 1B), which subtends ∼8% of the solid angle accessed by the eye of the fly (Krapp and Hengstenberg, 1997; Lewen et al., 2001). Because the pixels were arranged in a hexagonal array, the intensity of each pixel was calculated as the linear weighted average of intensities of four nearest-neighbor pixels. The fly was positioned at 8.4 cm from the screen, such that the projected interpixel separation approximately matches the interommatidial distance of 1.5° in the blowfly (Land, 1997). The match is, however, approximate because pixels were spatially blurred to prevent aliasing. This makes the experiment insensitive to the misalignment between the raster of the eye of the fly and the pixel-sampling raster. The fly watched the center of the display screen at an azimuth angle of 30°. The stimulus files were precomputed using MathWorks Matlab version 7.4. A custom-written C program was used to control a National Instruments PCI-6259 data acquisition card that processed the conversion of stimulus data to the analog voltage displayed on the screen.
Electrophysiology.
Tungsten microelectrodes (1 MΩ; tip diameter, ∼5 μm; FHC) were used to make differential single-unit recordings from H1 in the contralateral lobula plate in the right half of the head. H1 was identified by its characteristic excitatory response to horizontal back-to-front motion and its suppressive response to horizontal front-to-back motion. Control experiments using bar patterns of a 7° wavelength were conducted to adjust elevation and twist, such that the response of H1 is the maximum for horizontal motion and the minimum for vertical motion. Care was taken to achieve distinct isolated spikes with a high signal-to-noise ratio of at least 6. The voltage difference between the recording and reference electrodes was filtered, amplified, and thresholded by a window discriminator to generate discrete pulses. The pulses were time stamped with precision at 10 μs intervals by the National Instruments data acquisition card that generates the analog output. This guaranteed that the displayed stimuli and recorded spikes were synchronized to the same board clock. During the course of the experiment, the drift in spike amplitude was <30% and the amplitude remained well above the spike-discriminating threshold set before the start of the experiment.
Spike train information.
Noise, both internal and external to the system (de Ruyter van Steveninck and Bialek, 1995; Manwani and Koch, 1999; White et al., 2000), limits the information transmitted by a spike train. The presence of a distractor effectively adds another “noise” component, which further limits the transmitted information. We quantify this by estimating the information transmitted about a repeated yaw motion in the presence of a nonrepeated distractor motion across multiple trials (Fig. 1C,D, segments 1,2; de Ruyter van Steveninck et al., 1997). To remove the effect of initial transients, we discard data from the first 10 trials of each experiment.
To estimate the entropy of the spike train, spikes and spike-absent events within a time bin Δt = 1 ms were assigned values 1 and 0, respectively. This guaranteed a maximum of one spike in each time bin. A distribution of binary values or “words,” P(W), was obtained from the sequence of 1s and 0s lying within each time window, T ≥ Δt (de Ruyter van Steveninck et al., 1997). The entropy rate for infinite word length, or T → ∞, was obtained from linear extrapolation of the flattened part of the R versus 1/T curve. For a time window T, the total entropy rate of the spike train is obtained from P(W) as follows:
$$R_{\mathrm{Tot}}(T) = -\frac{1}{T}\sum_{W} P(W)\,\log_2 P(W). \tag{1}$$
The noise entropy rate is obtained by averaging the distribution of words obtained at each time instance P(W|t), over the time instances in a trial, as follows:
$$R_{\mathrm{Noise}}(T) = -\frac{1}{T}\left\langle \sum_{W} P(W|t)\,\log_2 P(W|t) \right\rangle_{t}. \tag{2}$$
The rate of yaw information transmitted by the spike train in the presence of the distractor is given by the difference between the total entropy rate and the average noise entropy rate, as follows:
$$R_{\mathrm{Info}}(T) = R_{\mathrm{Tot}}(T) - R_{\mathrm{Noise}}(T). \tag{3}$$
Information per spike can be obtained either by setting T = Δt and dividing by 〈r〉 in Equation 3, or by using the time-varying firing rate r(t) (Brenner et al., 2000b), as follows:
$$I(\mathrm{spk}) = \frac{1}{T_{\mathrm{trial}}}\int_{0}^{T_{\mathrm{trial}}} dt\,\frac{r(t)}{\langle r\rangle}\,\log_2\frac{r(t)}{\langle r\rangle}. \tag{4}$$
Here, Ttrial is the time period of repeated stimulus. To estimate the information carried by the spike train about repeated distractor motion in the presence of nonrepeated yaw, we use the same method of analysis as above, but with data from the sixth segment of the trial (Fig. 1C,D).
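The word-based estimates described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used here: it takes the total entropy from words pooled over all trials and times, the noise entropy from words across trials at a fixed time, and uses overlapping (sliding) word windows; all function names are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Entropy (bits) of the empirical distribution given by a Counter."""
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def total_entropy_rate(trials, n_bins, dt):
    """Entropy rate (bits/s) of words pooled over all trials and times.

    trials -- list of equal-length binary lists (1 = spike in a bin of dt s)
    n_bins -- word length in bins, so that T = n_bins * dt
    """
    counts = Counter(tuple(tr[t:t + n_bins])
                     for tr in trials
                     for t in range(len(tr) - n_bins + 1))
    return entropy_bits(counts) / (n_bins * dt)

def noise_entropy_rate(trials, n_bins, dt):
    """Entropy rate (bits/s) of words across trials, averaged over time."""
    n_times = len(trials[0]) - n_bins + 1
    h = 0.0
    for t in range(n_times):
        h += entropy_bits(Counter(tuple(tr[t:t + n_bins]) for tr in trials))
    return h / n_times / (n_bins * dt)

def info_rate(trials, n_bins, dt):
    """Information rate as total minus noise entropy rate."""
    return (total_entropy_rate(trials, n_bins, dt)
            - noise_entropy_rate(trials, n_bins, dt))

def info_per_spike(rate):
    """Single-spike information (bits) from r(t) on a uniform time grid."""
    mean = sum(rate) / len(rate)
    return sum((r / mean) * math.log2(r / mean) for r in rate if r > 0) / len(rate)
```

When every trial produces the same spike train, the noise entropy vanishes and the information rate equals the total entropy rate; a nonrepeated distractor raises the noise entropy and lowers the information rate accordingly.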
Error estimation.
It is well known that direct estimates of the entropy rate (RTot) and the noise entropy rate (RNoise) are biased due to undersampling, and that this bias becomes more prominent at larger values of T (Treves and Panzeri, 1995; Panzeri et al., 2007). With the assumption that spike correlations are finite, the true entropy rate can be estimated by taking the limiting conditions that data size → ∞ and T → ∞ (Treves and Panzeri, 1995; Strong et al., 1998), as follows:
Linear extrapolation is performed separately for the curves RNoise and RTot, using only the time windows in which each curve satisfies linearity. The information rate (RInfo) is obtained by subtracting the extrapolated RNoise from the extrapolated RTot.
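The extrapolation to 1/T → 0 amounts to fitting a straight line to the linear part of each curve and reading off the intercept. The sketch below uses an ordinary, unweighted least-squares fit as a simplifying assumption; the function name is hypothetical, and the choice of which time windows count as linear is left to the caller.

```python
def extrapolate_to_infinite_T(inv_T, rates):
    """Fit rates = intercept + slope * (1/T) and return (intercept, slope).

    inv_T -- values of 1/T (1/s) restricted to the linear regime
    rates -- entropy rates (bits/s) measured at those windows
    The intercept is the extrapolated rate for T -> infinity.
    """
    n = len(inv_T)
    mean_x = sum(inv_T) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in inv_T)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_T, rates))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

Applying the function once to the RTot points and once to the RNoise points, then subtracting the two intercepts, gives the extrapolated information rate.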
To obtain statistical error bounds for entropy estimates, we used bootstrap analysis (Efron and Tibshirani, 1993). From the 200 trials, an ensemble of 100 datasets (bootstrap samples), each containing 200 trials, was obtained by resampling. The SD of entropy for different values of T was obtained from the bootstrap samples and was used to minimize χ² (i.e., the least-squares error between estimates from the bootstrap and the parent distribution; Bevington and Robinson, 2003). The slope of the linear extrapolation is determined from the best-fit curve with a 95% confidence interval.
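The trial-resampling step can be sketched as below. This is a generic bootstrap over trials, assuming resampling with replacement; `bootstrap_sd`, `mean_spike_count`, and the seed are hypothetical, and any entropy estimator could be passed in place of the simple statistic shown.

```python
import random
import statistics

def bootstrap_sd(trials, statistic, n_boot=100, seed=0):
    """SD of `statistic` over bootstrap datasets resampled from `trials`.

    Each bootstrap dataset has as many trials as the original (e.g. 200)
    and is drawn with replacement; n_boot = 100 matches the text.
    """
    rng = random.Random(seed)
    n = len(trials)
    estimates = []
    for _ in range(n_boot):
        sample = [trials[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(sample))
    return statistics.stdev(estimates)

def mean_spike_count(trials):
    """Example statistic: mean number of spikes per trial."""
    return sum(sum(tr) for tr in trials) / len(trials)
```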
Reverse correlation methodology.
To extract the set of the most significant stimulus features associated with spikes, we use reverse correlation analysis (de Boer and Kuyper, 1968), which was extended to second order by sampling spike-triggered covariances (STCs; de Ruyter van Steveninck and Bialek, 1988). This allows the extraction of a low-dimensional stimulus space, which is spanned by a set of orthogonal eigenvectors associated with leading eigenvalues of the covariance matrix. Since the stimuli consist of combined yaw–distractor motion, both yaw and distractor stimuli are used in the analysis.
The spike time resolution in our analysis is 1 ms, but the stimulus velocity is sampled at 2 ms, corresponding to the frame rate of 500 Hz. In all results presented in this work, the yaw (Y) and distractor (D) velocities are scaled to their respective SDs, σY and σD. The spike-triggered average (STA) is obtained by averaging 100 ms time windows (τ) of stimulus fluctuations (s⃗0) preceding each spike, over all the N spikes, as follows:
$$\mathrm{STA}(\tau) = \frac{1}{N}\sum_{n=1}^{N}\vec{s}_0(t_n - \tau),$$
where tn represents the time of the nth spike. The STC is calculated by using the concatenated spike-triggered yaw and distractor stimuli with their STAs subtracted (de Ruyter van Steveninck and Bialek, 1988), as follows:
where i, j ∈ [1, 100] and s⃗Y, s⃗D represent the stimuli with the mean subtracted and normalized to their SD. The prior covariance is obtained by averaging over all time bins in the experiment, as follows:
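Eigenvalues and eigenvectors are then calculated from the difference between the spike-triggered and prior covariance matrices, as follows:
$$\Delta C = C_{\mathrm{spk}} - C_{\mathrm{prior}} = \begin{pmatrix} YY & YD \\ DY & DD \end{pmatrix}. \tag{12}$$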
The off-diagonal blocks depicted in Equation 12 represent yaw–distractor cross-covariances YD and DY, while the diagonal blocks represent the yaw and distractor autocovariances YY and DD.
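The construction of the STA and of ΔC from concatenated yaw and distractor windows can be sketched on a toy model. Everything specific in this example is an assumption made for illustration: the function names, the six-sample window, the unit-variance white-noise stimuli, and the toy neuron that fires whenever yaw three samples earlier exceeds 1. The eigen-decomposition of ΔC is omitted.

```python
import random

def window(yaw, dis, t, n_lags):
    """Concatenated yaw and distractor histories; entry k is lag k before t."""
    return ([yaw[t - k] for k in range(n_lags)]
            + [dis[t - k] for k in range(n_lags)])

def mean_vec(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance(rows):
    """Covariance matrix of the rows around their own mean."""
    m = mean_vec(rows)
    dim = len(m)
    c = [[0.0] * dim for _ in range(dim)]
    for r in rows:
        dev = [x - mu for x, mu in zip(r, m)]
        for i in range(dim):
            for j in range(dim):
                c[i][j] += dev[i] * dev[j]
    return [[v / len(rows) for v in row] for row in c]

def sta_and_delta_c(yaw, dis, spike_times, n_lags):
    """Return (STA, delta_C) with delta_C = C_spike - C_prior."""
    spike_rows = [window(yaw, dis, t, n_lags)
                  for t in spike_times if t >= n_lags - 1]
    prior_rows = [window(yaw, dis, t, n_lags)
                  for t in range(n_lags - 1, len(yaw))]
    c_spk = covariance(spike_rows)      # around the STA
    c_prior = covariance(prior_rows)    # over all time bins
    delta = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(c_spk, c_prior)]
    return mean_vec(spike_rows), delta

# Toy data: the "neuron" spikes when yaw three samples earlier exceeds 1.
rng = random.Random(0)
n_samples = 5000
yaw = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
dis = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
spikes = [t for t in range(3, n_samples) if yaw[t - 3] > 1.0]
sta, delta_c = sta_and_delta_c(yaw, dis, spikes, n_lags=6)
```

In this toy case the STA peaks at yaw lag 3 and the corresponding diagonal entry of ΔC is negative, because spiking restricts the yaw variance along that dimension. The distractor block stays near zero because the toy neuron ignores the distractor.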
Significance testing.
Although the symmetry of ΔC guarantees that the eigenvalues (λ) are real, they can be positive or negative, depending on an increase or decrease in variance along the eigenvector directions. To extract the set of most significant eigenvalues, we used a threshold radius determined from the Wigner distribution (Wigner, 1958). The eigenvalue spectrum is parsed into a relevant set and an irrelevant set that embody the correlations and random noise in ΔC, respectively, such that the eigenvalue distribution of the irrelevant set resembles the distribution of eigenvalues of a symmetric random matrix (CRnd) with no element-wise correlations. In the limit of infinite data, this distribution converges to the Wigner semicircle, as follows:
$$\rho(\lambda) = \frac{2}{\pi R^{2}}\sqrt{R^{2} - \lambda^{2}}, \qquad |\lambda| \le R,$$
where M is the dimensionality of the covariance matrix, $\sigma_M^2$ is the variance of matrix elements of CRnd, and $R = 2\sigma_M\sqrt{M}$.
We implement this method by first generating CRnd from N randomly chosen prior stimulus time windows, where N represents the total number of spikes. We then determine the radius (R) of the semicircle from a nonlinear least-squares fit of the irrelevant eigenvalues of ΔC to the theoretical cumulative Wigner distribution (Eq. 15). We can also formulate R analytically. Each off-diagonal element of CRnd is the product of two independent random variables with an SD of 1, averaged over N spikes. Therefore, the SD of the elements of CRnd is $1/\sqrt{N}$, which gives $R = 2\sqrt{M/N}$.
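The analytic radius and the resulting significance test can be sketched as follows. The sketch assumes the standard semicircle law normalized to unit area, with radius R = 2σ_M√M and σ_M = 1/√N; the function names are hypothetical, and the nonlinear least-squares fit to the cumulative distribution is not reproduced.

```python
import math

def wigner_radius(dim, n_spikes):
    """Analytic radius R = 2 * sqrt(M / N) for unit-variance stimuli."""
    return 2.0 * math.sqrt(dim / n_spikes)

def semicircle_density(lam, radius):
    """Wigner semicircle density, normalized to unit area."""
    if abs(lam) >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - lam ** 2) / (math.pi * radius ** 2)

def semicircle_cdf(lam, radius):
    """Closed-form cumulative distribution of the semicircle law."""
    x = max(-radius, min(radius, lam))
    return (0.5
            + x * math.sqrt(radius ** 2 - x ** 2) / (math.pi * radius ** 2)
            + math.asin(x / radius) / math.pi)

def significant_eigenvalues(eigenvalues, radius):
    """Eigenvalues lying outside the semicircle, in their original order."""
    return [lam for lam in eigenvalues if abs(lam) > radius]
```

As an illustration with hypothetical numbers, a 100-dimensional covariance matrix estimated from 10,000 spikes gives a radius of about 0.2, so only eigenvalues with magnitude above that value would be retained.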
Input–output function.
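Bayes' theorem allows us to map our prior stimulus knowledge to the spiking response of H1. The functional form of this map is represented by the input–output function (Brenner et al., 2000a), as follows:
$$\frac{r(\tilde{s}_1,\ldots,\tilde{s}_K)}{\bar{r}} = \frac{P(\tilde{s}_1,\ldots,\tilde{s}_K\,|\,\mathrm{spk})}{P(\tilde{s}_1,\ldots,\tilde{s}_K)}, \tag{17}$$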
where P(s̃1,…,s̃K) and P(s̃1,…,s̃K|spk) represent the prior and the posterior distribution of stimulus projections (s̃k) on the relevant eigenvectors (êk). The normalized firing rate shown in Equation 17 describes the input–output relationship.
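For a single relevant dimension (K = 1), the ratio of posterior to prior distributions can be estimated from histograms. The sketch below is illustrative: the function name, the binning scheme, and the convention of returning `None` for empty prior bins are assumptions.

```python
import bisect

def input_output(projections, spiked, edges):
    """Normalized firing rate P(s|spk) / P(s) in each bin of the projection.

    projections -- stimulus projection onto one eigenvector, one per time bin
    spiked      -- 1 if a spike occurred in that time bin, else 0
    edges       -- increasing bin edges; values outside them are ignored
    """
    n_bins = len(edges) - 1
    prior = [0] * n_bins
    post = [0] * n_bins
    for s, k in zip(projections, spiked):
        b = bisect.bisect_right(edges, s) - 1
        if 0 <= b < n_bins:
            prior[b] += 1
            post[b] += k
    n_prior, n_spk = sum(prior), sum(post)
    return [None if p == 0 else (q / n_spk) / (p / n_prior)
            for p, q in zip(prior, post)]
```

Values above 1 mark projections at which the neuron fires more than its mean rate, and values below 1 mark suppression.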
Yaw–distractor subspace information.
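The information per spike associated with a relevant dimension can be found by calculating the entropy difference between distributions of spike conditional stimulus projections and prior stimulus projections. For a K-dimensional subspace, this is given by the following:
$$I_{\mathrm{spk}} = -\int d^{K}\tilde{s}\; P(\tilde{s}_1,\ldots,\tilde{s}_K)\,\log_2 P(\tilde{s}_1,\ldots,\tilde{s}_K) + \int d^{K}\tilde{s}\; P(\tilde{s}_1,\ldots,\tilde{s}_K\,|\,\mathrm{spk})\,\log_2 P(\tilde{s}_1,\ldots,\tilde{s}_K\,|\,\mathrm{spk}). \tag{18}$$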
Note that this formulation (Eq. 18) is the same as that used by DeWeese and Meister (1999), who define it as “stimulus-specific” information. According to this definition, the information gained from the observation of a symbol can be positive or negative. Recall that positive eigenvalues are associated with the broadening of the distribution, which leads to a posterior entropy that is higher than the prior and, consequently, to negative symbol information. Similarly, for negative eigenvalues, the symbol information is positive. Therefore, choosing this formulation instead of the alternative “surprise” (DeWeese and Meister, 1999) allows us to make a connection between symbol information and the expansion or compression of the stimulus space.
The information associated with nonoccurrence of spikes can be formulated as follows:
Though a symbol can carry negative information, the information averaged over the symbols, or mutual information (MI), is still positive, as follows:
We perform this information-theoretic analysis on the yaw subspace, the distractor subspace, and the combined yaw–distractor subspace.
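The sign behavior of the symbol information, and the positivity of its average, can be illustrated on discretized distributions. The sketch assumes the entropy-difference definition (prior entropy minus posterior entropy) and a prior that is the probability-weighted mixture of the spike and no-spike posteriors; the function names and the numbers in the usage note are hypothetical.

```python
import math

def entropy_bits(p):
    """Entropy (bits) of a discrete distribution given as probabilities."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def specific_information(prior, posterior):
    """Symbol information as prior entropy minus posterior entropy."""
    return entropy_bits(prior) - entropy_bits(posterior)

def mutual_information(p_spk, post_spk, post_nospk):
    """Return (MI, I_spk, I_nospk) for a binary spike / no-spike symbol.

    The prior is built as the mixture of the two posteriors, which is what
    makes the average over symbols non-negative.
    """
    prior = [p_spk * a + (1.0 - p_spk) * b
             for a, b in zip(post_spk, post_nospk)]
    i_spk = specific_information(prior, post_spk)
    i_nospk = specific_information(prior, post_nospk)
    return p_spk * i_spk + (1.0 - p_spk) * i_nospk, i_spk, i_nospk
```

For instance, `mutual_information(0.4, [0.25] * 4, [1.0, 0.0, 0.0, 0.0])` describes a spike whose posterior is broader than the prior, so the spike carries negative information while the average over both symbols remains positive.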
Results
Firing pattern modulation
H1 is responsive to yaw motion, and the peristimulus time histogram (PSTH) for repeated presentations of the same pseudorandom yaw waveform therefore shows strong variations in firing rate, often with sharp onsets and offsets. The addition of random nonrepeated distractor motion to the original yaw stimulus will tend to wash out the structure of such yaw-induced rate fluctuations, lowering the information that H1 carries about yaw. This is illustrated in Figure 2 for both roll and pitch distractors at ξR = σR/σY = 10 and ξP = σP/σY = 3, respectively. For the pure yaw case, the spike raster locks on to preferred yaw motion but shows suppression to nonpreferred yaw motion (Fig. 2A). From the modulation in firing rate, we can compute the information per spike according to Equation 4, and for the conditions of the experiment shown in Figure 2, we get I(spk) = 1.82 bits. The spike rasters for yaw–roll and yaw–pitch combinations show similar locking behavior but have increased variability compared with the pure yaw case. This increase in variability results in broadening of the PSTH peaks (Fig. 2B) and a decrease of single-spike information. Conversely, regions with very low activity experience an increase in firing rate (Fig. 2B, insets). As an example of the spread in PSTH peaks, we fit a Gaussian to an isolated peak in the PSTH (Fig. 2C) over the region where its shape is preserved. The SD of the Gaussian increases from 4 ms with pure yaw, to 14 ms with combined yaw and roll, and to 8 ms with combined yaw and pitch. The effect of the distractor on firing rate is relatively small at intermediate values of ξR and ξP. For example, the mean firing rate (r̄) changes from 35 spikes/s without the distractor to 31 spikes/s at ξR = 10 and 34 spikes/s at ξP = 3. The effect is much larger at higher values of ξ, especially for roll, when r̄ decreases to ∼25 spikes/s at ξR = 100.
Impact of distractor on spiking response to repeated yaw stimulus. A, A 60-trial spike raster over a 350-ms-long time window, from pure yaw at σY = 100°/s, combined yaw-roll at ξR = 10, and combined yaw-pitch at ξP = 3. B, PSTH (2 ms bins) shows response averaged over 200 trials. C, Gaussian fits to isolated peaks in PSTH to determine spiking precision are shown for the three cases. The stimuli corresponding to these responses are shown in segment 1 and segment 2 of Figure 1, C and D. An example of change in local firing rate from 0.1 spikes/s representing a yaw-suppressed state to 13 and 4 spikes/s, respectively, for roll- and pitch-induced excitatory states is shown by the insets in B.
Effect of distractor on yaw information rate
A more detailed analysis of spike train information takes into account how spike sequences are reproduced across trials. For pure yaw (σY = 100°/s), the RTot is 175 bits/s and the RNoise is 118 bits/s (Fig. 3A), resulting in an RInfo of 57 bits/s. With a combined roll (ξR = 10), RInfo drops to 38 bits/s, but RTot and RNoise both increase. Notably, the increase in RNoise is higher by almost a factor of 2 than the increase in RTot. This suggests that the interspike correlation decreases and the variability in spike sequences across trials increases. Because limited sampling affects RNoise more than RTot, the RNoise curve collapses earlier than the RTot curve (Strong et al., 1998).
Distractor-induced change in spike train entropy. A–C, Asymptotic changes in RTot, RNoise, and RInfo with increasing length of time window T are shown for one set of parametric values: repeated yaw at σY = 100°/s or ξR/P = 0 (A); repeated yaw with nonrepeated roll at ξR = 10 (B); and repeated yaw with nonrepeated pitch at ξP = 3 (C). Linearly extrapolated values for T → ∞ are marked by squares on the 1/T = 0 axes with the respective color codes. The time resolution for spikes was chosen as 1 ms. The difference between the extrapolated RTot and RNoise gives RInfo for 1/T = 0. Bootstrap errors in entropy estimates were <5%.
Spike train information rates at different distractor strengths parametrized by ξR and ξP are shown in Figure 4A. In the region where ξR increases from 1 to 10, RTot remains virtually unchanged at 181 bits/s, while RNoise increases from 122 to 143 bits/s. For ξR > 10, both RTot and RNoise steadily decrease. The onset of the decrease of RTot occurs earlier than the decrease of RNoise. This suggests that when the strength of the distractor is low, RTot increases due to a decrease in interspike correlations. When the strength of the distractor becomes very large, the mean firing rate drops. This leads to a lower number of spikes and therefore a smaller repertoire of binary words (W), which reduces both RNoise and RTot.
The yaw information rate decreases with increasing strength of the distractor. The change in RInfo (in bits per second) from spike train to repeated yaw with nonrepeated roll (A) and repeated yaw with nonrepeated pitch (B; Fig. 1C,D, segment 2). Entire ranges of ξR values from 10⁰ to 10², and ξP values from 10⁻¹ to 10¹ are used. The three color-coded curves (brown, blue, and red) stand for RTot, RNoise, and RInfo, respectively. Each point represents extrapolated values for the limiting case 1/T → 0, corresponding to the colored squares in Figure 3. The values of ξR and ξP highlighted by the gray shaded boxes will be used later as guides for illustrations in the subsection Spike-triggered covariance. The x-axis is shown on a logarithmic scale. The statistical error from bootstrap is <5%.
Over the range of ξR from 1 to 100, RInfo decreases monotonically from 57 to 8 bits/s (i.e., by a factor of 7; Fig. 4A). Over the range of ξP from 0.1 to 10, RInfo decreases from 58 to 22 bits/s (i.e., by a factor of 2.6; Fig. 4B). The decrease in RInfo for the two cases, however, is subtly different. While the decrease in RInfo for pitch occurs largely due to an increase in RNoise, for roll it occurs due to a decrease in both RTot and RNoise. Given the geometry of the display used in our experiments (Fig. 1B), the local tangential velocity component of roll at the edge of the display is ∼3000°/s for a roll velocity of 10,000°/s (see Materials and Methods). Conceivably, the spatial contrast is reduced at these high velocities, which could very well decrease both RTot and RNoise. The information rate shows an approximately linear dependence on the logarithm of ξR and ξP. The largest error in entropy estimates stems from the uncertainty in determining the slope of the extrapolated curves (Fig. 3). Although the absolute value of RInfo changes from one fly to the other, we found the characteristic dependence of RInfo on ξ to be consistent across all six tested flies.
Reverse correlation
The previous subsection quantified the effect of distractor motion on the information encoded about yaw. Here we sketch a more detailed picture of the interactions between yaw and distractor motion associated with spiking response. In principle, this is a problem of describing interactions in a high-dimensional space of stimulus waveforms, but in practice it is often the case that only a few dimensions are relevant. The analysis is based on a generalization of the reverse correlation method (de Boer and Kuyper, 1968), which starts by extracting a few stimulus dimensions that define the relevant subspace. Given that we use a symmetric Gaussian stimulus, this method is guaranteed to provide unbiased estimates of the linear filters, namely, the STA and the eigenvectors of the STC matrix (Chichilnisky, 2001; Paninski, 2003). To estimate the filters, we need many independent samples of the stimulus waveform. For this, we use data from the fifth segment (Fig. 1C,D) of the trial, where yaw and distractor waveforms are nonrepeated across trials. The results are illustrated using the following subset of parameter values: ξR = {0, 3, 10, 30} and ξP = {0, 1, 3, 10}.
Spike-triggered average
The first moment of the stimulus leading up to a spike is given by the STA. For calculating the STA, we use the 100 ms history of yaw and distractor motion preceding each spike (de Ruyter van Steveninck and Bialek, 1988).
As expected, the yaw STAs from both yaw–roll and yaw–pitch stimulus combinations are positive and unimodal, with a peak preceding the spike by ∼20 ± 2 ms. As ξR increases from 0 to 30, the yaw STA amplitude decreases to 45% of its amplitude at ξR = 0 (Fig. 5A). The roll STA (Fig. 5A, dotted line) is positive and relatively small, but shows signs of increase with an increase in ξR, which suggests that H1 is excited more by positive roll than by negative roll. Over the range of ξR ∈ [0, 30], the mean firing rate, r̄, decreases by <10%. The pitch STA (Fig. 5B, dotted line) has a downward peak, the amplitude of which increases with an increase in ξP. This could be due to horizontal motion components arising from the vertical movement of slanted pattern edges (Marr and Ullman, 1981). Additionally, projections from the VS1 neuron could play a role (Haag and Borst, 2003). To test the dependence on pattern structures, we used a symmetric checkerboard pattern of square size 6° × 6° and contrast 1 (not shown). The resulting reduction in pitch STA (Fig. 5B, gray curve) indicates that the structure of patterns does play a significant role in modulating the response of H1 to vertical motion.
Yaw STA change induced by distractors. A, Yaw STAs from combined yaw–roll stimulus at four different values of ξR. The dotted line shows the roll STA (STAR) at ξR = 10. B, Yaw STAs from combined yaw–pitch stimulus at four different values of ξP. The dotted line shows the pitch STA (STAP) with the 2-D random pattern, and the gray line shows the pitch STA with a checkerboard pattern having the same spatial correlation length as the 2-D random pattern. These two illustrative curves are shown for the case ξP = 3. The insets in A and B show exponential fits (color coded) to the tail of the yaw STA curves.
To study the correspondence between the time-lagged mean stimulus and a spike, we fit an exponential to the tail of the yaw STA (Fig. 5A,B, insets). The decay constant of the exponential decreases from 40 to 24 ms as ξR increases from 3 to 30, while it decreases from 40 to 28 ms as ξP increases from 1 to 10. This suggests that as the distractor variance increases, spikes are more likely to be generated by stimulus features from the immediate past.
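A decay constant of this kind can be estimated as in the sketch below. It is only an illustration: it assumes a single-exponential tail A·exp(−lag/τ) and uses a straight-line fit to the logarithm of the STA, which is a simplification and not necessarily the fitting procedure used here. The function name `tail_decay_constant` and the synthetic tail are hypothetical.

```python
import numpy as np

def tail_decay_constant(lags_ms, sta_tail):
    """Estimate tau in A * exp(-lag / tau) by a linear fit to log(STA).

    lags_ms  : time before the spike (ms), restricted to the STA tail
    sta_tail : STA amplitudes at those lags (must be positive)
    """
    lags_ms = np.asarray(lags_ms, dtype=float)
    sta_tail = np.asarray(sta_tail, dtype=float)
    slope, intercept = np.polyfit(lags_ms, np.log(sta_tail), 1)
    return -1.0 / slope, np.exp(intercept)

# Synthetic, noise-free tail with a 40 ms decay constant.
lags = np.arange(30, 100, 2)          # ms before the spike, past the STA peak
tail = 0.8 * np.exp(-lags / 40.0)
tau, amp = tail_decay_constant(lags, tail)
```

With noisy data, a log-linear fit weights small amplitudes heavily, so a nonlinear least-squares fit on the raw amplitudes may be preferable.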
Spike-triggered covariance
We compute the joint yaw–distractor spike-triggered covariance around the respective yaw and distractor STAs (de Ruyter van Steveninck and Bialek, 1988). To illustrate the covariance structure, we choose parameter values ξR = 10 and ξP = 3, at which RInfo for yaw–roll and yaw–pitch stimulus combinations are comparable (Fig. 4, gray shaded boxes).
The structures of the temporal auto-covariances and cross-covariances between stimuli are displayed in the diagonal blocks (YY, RR, PP) and off-diagonal blocks (YR, YP), respectively, in Figure 6, A and B. The matrix values represent covariance calculated around the spike-triggered mean (i.e., the STA). For example, negative covariance in the YR block indicates that spikes are associated with interactions between an increasing yaw and a decreasing roll around their respective means, or vice versa. The covariance structures indicate that spikes are associated with yaw, distractor, and yaw–distractor correlations. The extent of the structures indicates the existence of short- and long-range temporal correlations, with the latter lasting for at least 50 ms.
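The block structure of the joint covariance can be illustrated with a short sketch. It assumes that ΔC is the spike-triggered covariance (computed around the STA) minus the prior covariance, and that each stimulus vector concatenates the yaw history and the distractor history. The function name `delta_covariance` and the toy spiking rule are hypothetical.

```python
import numpy as np

def delta_covariance(yaw, distractor, spike_idx, n_lags=50):
    """Return dC = C_spike - C_prior for the joint yaw-distractor history.

    Each stimulus vector concatenates n_lags yaw samples and n_lags distractor
    samples, so dC is (2*n_lags x 2*n_lags) with blocks [[YY, YD], [DY, DD]].
    """
    yaw = np.asarray(yaw, dtype=float)
    distractor = np.asarray(distractor, dtype=float)
    offsets = np.arange(-n_lags, 0)

    def windows(idx):
        idx = np.asarray(idx)
        idx = idx[idx >= n_lags]          # keep only full history windows
        rows = idx[:, None] + offsets[None, :]
        return np.hstack([yaw[rows], distractor[rows]])

    spike_windows = windows(spike_idx)
    prior_windows = windows(np.arange(n_lags, len(yaw)))
    # np.cov subtracts the mean of each set, i.e., the STA for the spike set.
    c_spike = np.cov(spike_windows, rowvar=False)
    c_prior = np.cov(prior_windows, rowvar=False)
    return c_spike - c_prior

# Toy neuron that spikes when yaw 5 bins earlier exceeds 1 (roll is ignored).
rng = np.random.default_rng(1)
yaw = rng.standard_normal(40_000)
roll = rng.standard_normal(40_000)
spikes = np.flatnonzero(yaw[:-5] > 1.0) + 5
dC = delta_covariance(yaw, roll, spikes, n_lags=20)
yy_block = dC[:20, :20]                   # yaw auto-covariance change
yr_block = dC[:20, 20:]                   # yaw-roll cross-covariance change
```

In the toy example the only sizable entry is a negative one on the YY diagonal at the lag that drives spiking, which corresponds to a compression of variance along that yaw direction.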
Joint yaw–distractor covariance matrices and significant eigenvalues. A, B, The heat maps represent temporal covariances for the yaw–roll combinations at ξR = 10 (A) and yaw–pitch combinations at ξP = 3 (B). The covariances are shown for 100 ms before a spike (at t = 0 ms), where time is represented along x- and y-axes binned at 2 ms. The diagonal and off-diagonal matrix blocks represent auto-covariances YY, RR, and PP, and cross-covariances YR and YP, respectively (refer to Eq. 12). The color bar shows the range of covariance change from the prior. C, D, Eigenvalue spectra of the ΔC matrices shown in A and B. The black dashed lines represent the upper and lower bounds of eigenvalues based on bootstrap resampling of randomly shifted spike trains. The significant eigenvalues are shown by red circles. E, Exemplified for the case ξR = 10, significant eigenvalues of ΔC (red marker) lie outside the radius (R) of the Wigner semicircle. The grayed out areas show sections of the eigenvalue spectra (solid black circles) used for nonlinear least-squares fit to the theoretical cumulative Wigner distribution (solid green line). The left panel shows the frequency distribution of eigenvalues ρ(λ) in black, the Wigner semicircle in green, and the eigenvalue outliers in red. F, Spectrum and distribution of eigenvalues of the spike-independent covariance matrix CRnd obtained from an ensemble of N time windows chosen randomly from the prior stimulus. The radii of Wigner semicircles from the nonlinear least-squares fit for the cases depicted in E and F match within 1% of each other.
Using the cutoff set by the radius of the Wigner semicircle, we found one positive and two negative significant eigenvalues for the case ξR = 10, and one positive and three negative significant eigenvalues for the case ξP = 3 (Fig. 6C,D, red circles). For the other values of distractor variance, we found at most one positive and four negative significant eigenvalues. This confirms that, even for a complex stimulus with multiple motion components, the relevant subspace has only a few dimensions. An interesting offshoot of this result, the significance of which will be discussed later in the section Relevant subspace information, is that spikes carry negative information about features defined by ê+.
The radius of the Wigner semicircle provides a hard cutoff for determining the relevant eigenvalues, unlike the bootstrap bounds, the accuracy of which depends on the number of bootstrap samples. To obtain the radius, we fit the set of irrelevant eigenvalues of ΔC from the central portion of the eigenvalue spectra (Fig. 6E, gray) to the theoretical cumulative Wigner distribution (Eq. 15). For the case ξR = 10 illustrated in Figure 6, E and F, the radius of the Wigner semicircle is R = 0.112. The three eigenvalues (Fig. 6E,F, red markers) lying outside the perimeter of the Wigner semicircle are identified as relevant. With N = 28,236 spikes and M = 100 dimensions, the analytic estimate of the radius turns out to be R = 2√(M/N) ≈ 0.119.
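The idea of using a semicircle radius as an eigenvalue cutoff can be sketched as below. The sketch does not reproduce the fit to the cumulative Wigner distribution; instead it assumes white, unit-variance stimuli and uses the leading-order random-matrix estimate 2√(M/N) for the radius, which is an assumption of this example. The function name `significant_eigenvalues` and the planted direction are hypothetical.

```python
import numpy as np

def significant_eigenvalues(delta_c, n_spikes):
    """Flag eigenvalues of dC lying outside an assumed Wigner-type radius.

    Assumption: stimuli are white with unit variance, so the null spectrum of
    dC is confined to roughly |lambda| <= 2 * sqrt(M / N), with M dimensions
    and N spikes. This is a rough cutoff, not a fit to the measured spectrum.
    """
    eigvals, eigvecs = np.linalg.eigh(delta_c)   # eigenvalues in ascending order
    radius = 2.0 * np.sqrt(delta_c.shape[0] / n_spikes)
    keep = np.abs(eigvals) > radius
    return eigvals[keep], eigvecs[:, keep], radius

# Null spectrum plus one planted compressed direction along axis 0.
rng = np.random.default_rng(2)
M, N = 100, 28_236
samples = rng.standard_normal((N, M))
samples[:, 0] *= 0.5                             # variance 0.25 along axis 0
delta_c = np.cov(samples, rowvar=False) - np.eye(M)
lam, vec, radius = significant_eigenvalues(delta_c, N)
```

The planted direction yields a large negative eigenvalue well outside the radius. Finite-sample fluctuations can push a null eigenvalue marginally past this analytic radius, so an estimate fitted to the bulk of the measured spectrum is the more careful choice.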
Leading dimensions of yaw–distractor subspace
The eigenvectors corresponding to the leading eigenvalues of ΔC define the stimulus subspace most relevant to H1. To obtain an interpretation of the actual motion features that might trigger spiking, we examine the shapes of the three eigenvectors ê−1, ê−2 and ê+1 corresponding to the two leading negative and the leading positive eigenvalues (Fig. 7A–C). An exception is drawn for the pure yaw stimulus case, for which we did not find any positive eigenvalues.
Eigenvectors associated with the leading eigenvalues (λ) show increased distractor representation. A–C, The leading dimensions for three cases: pure yaw, for which there is no positive λ value (A); yaw–roll combinations at ξR = {3, 10, 30} (B); and yaw–pitch combinations at ξP = {1, 3, 10} (C). Red curves represent the eigenvector ê+1, corresponding to the largest positive eigenvalue; blue and green curves represent ê−1 and ê−2, corresponding to the two largest negative eigenvalues. Each eigenvector has a left yaw half (100 ms) and a right distractor half (100 ms), which are separated by the dashed vertical line. The time of spike occurrence is denoted by t = 0 ms.
The eigenvector ê+1 of the yaw–distractor space represents the direction of largest expansion of the stimulus space. For both yaw–roll and yaw–pitch combinations, we obtain one such direction. Since positive yaw velocity excites H1, the sign of each eigenvector is set by imposing the constraint that the yaw part of the curve has a positive rising peak (Fig. 7). The profile of distractor and yaw halves of ê+1 clearly shows that the effect of the distractor is much larger than the effect of yaw. The presence of a heavy tail suggests relatively long-lasting temporal effects. In addition, the yaw and distractor halves are monophasic, with their relative phases opposite to each other. This could point to a competition for generating spikes by preferential (or positive) roll, and downward pitch motion. We will discuss this in further detail in the next subsection.
The temporal structure of the eigenvector ê−1 suggests that it is dominated by yaw when the distractor variance is low, and by distractor when the distractor variance is high. The negative eigenvalue associated with ê−1 indicates a compression of stimulus space along the yaw STA and along a distractor direction that varies with ξ. This can be understood from the change in the temporal profile of the distractor half of ê−1, from a monophasic one to a triphasic one (Fig. 7B,C, blue curves). The triphasic profile demonstrates the presence of positive and negative distractor correlations with different time lags, thus representing the feature “jerk.” Since the eigenvector ê−2 resembles the derivative of the STA, it represents the feature “acceleration.”
To estimate the change in yaw variance induced solely by distractor, eigenvalue decomposition was performed on the YY block (Eq. 12) of the ΔC matrix, which contains only yaw correlations. Table 1 shows that the magnitude of the largest negative eigenvalue of yaw stimulus space decreases rapidly with an increase in distractor variance. Over the full range of ξR and ξP, the corresponding eigenvalues decrease by 86% and 80%, respectively. Thus, spike coupling to yaw stimulus is markedly weakened by an increase in distractor fluctuations.
Leading eigenvalues of the yaw subspace
Response as a function of relevant stimulus
The input–output relationship (Eq. 17) helps bypass the complexities of sensory processing in the intermediate stages and provides the instantaneous firing rate based on the stimulus distribution. To obtain the input–output mapping, we project the spike-triggered and prior stimuli on each of the yaw and distractor halves of the eigenvectors. The 1-D and 2-D distributions of projections are then used to compute the corresponding response curves (Brenner et al., 2000a).
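A 1-D response curve of this kind can be sketched as below. The sketch assumes the standard Bayes-rule form r(s)/r̄ = P(s | spike)/P(s), estimated from histograms of the projections; whether this matches Eq. 17 term by term is an assumption. The function name `response_curve` and the toy rectifying neuron are hypothetical.

```python
import numpy as np

def response_curve(prior_proj, spike_proj, bins):
    """Normalized rate r(s)/r_mean = P(s | spike) / P(s) from 1-D projections."""
    p_prior, edges = np.histogram(prior_proj, bins=bins, density=True)
    p_spike, _ = np.histogram(spike_proj, bins=edges, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    ratio = np.full_like(p_prior, np.nan)
    ok = p_prior > 0                      # leave unsampled bins undefined
    ratio[ok] = p_spike[ok] / p_prior[ok]
    return centers, ratio

# Toy rectifier: spike probability rises sigmoidally with the projection s.
rng = np.random.default_rng(3)
s = rng.standard_normal(500_000)
p_fire = 0.1 / (1.0 + np.exp(-4.0 * (s - 0.5)))
fired = rng.random(s.size) < p_fire
centers, r_norm = response_curve(s, s[fired], bins=np.linspace(-3, 3, 25))
```

The same ratio computed with 2-D histograms of the yaw and distractor projections gives the joint input–output map.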
First, we focus on how the stimulus feature associated with ê−1 impacts the firing rate. The response curve obtained using the yaw half of ê−1 shows a characteristic sigmoidal increase, which eventually plateaus at approximately three times the mean firing rate (Fig. 8A,C, top). This is a typical rectifier-type response indicating the direction selectivity of H1. Unlike negative yaw, negative distractor does not completely suppress the response of H1, as indicated by the firing rate, which remains at the mean firing level (Fig. 8A,C, left). This suggests excitation arising from a relatively weak distractor in an otherwise yaw-dominated direction. For positive distractor projections, the firing rate initially increases but declines when ξ becomes large, with the latter clearly evident at ξR = 30 (Fig. 8A,C, left, blue curve). Considering that the actual SD of roll in this case is 30 times that of yaw, it is reasonable to expect that such velocities would reduce image contrast and wash out otherwise detectable features, which in turn would diminish the response. An overall assessment of the response curves indicates a broadening of yaw posterior and a narrowing of distractor posterior as ξR or ξP increases. If we define stimulus sensitivity as the maximum slope of the stimulus–response curve, then this indicates an increase in yaw sensitivity and a decrease in distractor sensitivity.
Normalized response as a function of stimulus projection on the leading stimulus dimensions ê+1 and ê−1. A–D, Response curves obtained from ê−1 corresponding to λ < 0 are color coded blue (A, C), and those obtained from ê+1 corresponding to λ > 0 are color coded red (B, D). The 2-D input–output (I/O) heat map, marginal prior (dashed black) and posterior (solid black) distributions (pdfs), and response curves (solid red, blue) derived from them are shown for each of these two cases. The prior and posterior pdfs are generated by projecting yaw and distractor modes of each eigenvector onto the respective spike-triggered yaw and distractor stimuli. The projection values are denoted by sY, sR, and sP for yaw, roll, and pitch. The horizontal and vertical axes of the 2-D heat map have units of yaw and distractor projections. The y-axis of the response curve shows the normalized firing rate, rN. The parameter values ξR = {3, 10, 30} and ξP = {1, 3, 10} are chosen to highlight the change in nonlinearity with increasing distractor variance. The inset in B (ξR = 10) shows a peak as well as a shoulder (gray shaded regions) in the spike-triggered distributions obtained from pure roll STA at σR = 1000°/s and from ê+1R. The error bands (color coded light red and blue) represent the SEM from 100 bootstrap samples.
It is essential to set a reference for deriving the sign of the physical velocity from the sign of the velocity projection. As the STA shape is a good indicator of the direction of the mean velocity that is preferred by the neuron, we shall use it as our reference. It is clear that the spike-triggered roll and yaw distributions peak respectively at negative and positive projections of ê+1 (Fig. 8B). This, however, fails to indicate the direction of the physical roll or yaw velocity that excites or suppresses the neuron. Therefore, we compared the roll posterior distribution obtained from the eigenvector ê+1R with that obtained from the STA of a pure roll stimulus, at the same roll variance (Fig. 8B, inset). As shown, the two curves are asymmetric with a large peak and a prominent shoulder. Because the sign of the roll STA is positive and the peak of the roll posterior obtained from projecting the spike-triggered roll on the positive designated roll half of ê+1 lies on the negative side of the projection axis, the negative roll projection value for ê+1 corresponds to physical positive roll velocities. The peak indicates strong excitation by positive [counterclockwise (CCW)] roll, and the shoulder indicates weak excitation from negative [clockwise (CW)] roll. The asymmetric U-shaped nonlinearity (Fig. 8B, left) demonstrates the bidirectional response to roll.
For the yaw–pitch case (Fig. 8D), the firing rate mainly increases with an increase in negative pitch projection values. Since the pitch STA is negative, using the same line of argument as above, we find that downward-directed pitch motion corresponds to negative projection values for ê+1. Thus, the feature represented by ê+1, associated with an excitatory response, is largely a reflection of the pitch STA.
A topographical map of the 2-D input–output relationship allows us to examine the firing rate dependence on the interaction of the relevant features. The response profiles obtained from projections on ê−1Y, ê−1R and on ê−1Y, ê−1P display a bean-like shape, whose center is located in the positive half of the yaw projection space, with the two ends extending out into the positive and negative halves of the distractor projection space (Fig. 8A,C). As the distractor variance increases, the two ends gradually fade away, leaving only the center. This indicates the dominance of yaw stimulus along ê−1, the effect of which lingers even at high distractor variance, unlike the distractor response, which is weak and ultimately vanishes. Interestingly, for ê+1, the bean extends farther into the two diagonally opposite quadrants (Fig. 8B,D). This implies that a coupling between strong preferred distractor and weak positive or negative yaw, and between weak nonpreferred distractor and strong positive yaw, increases the firing rate. Overall, these results demonstrate the presence of excitatory components in distractor motion and a competitive inhibition of the spiking response to yaw caused by strong preferred distractor motion components.
Stimulus energy shapes input–output response
Direct visual inspection of the 2-D input–output relations of Figure 8 shows that they are not separable as the product of two marginal functions. Moreover, their shape changes with the value of ξ. Here we examine those relations in more detail by drawing various 1-D sections. Sections parallel to the horizontal axis describe how yaw encoding depends on the amount of distractor motion, as measured by its projection on the eigenvector. Conversely, vertical sections describe the encoding of the distractor at different values of yaw projection.
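Taking 1-D sections through a 2-D input–output map can be sketched as follows. The map is assumed to be the ratio of the joint spike-triggered and prior distributions of the two projections, estimated from 2-D histograms. The function names `io_map_2d` and `yaw_section`, and the toy neuron in which the distractor shifts the yaw threshold, are hypothetical.

```python
import numpy as np

def io_map_2d(prior_xy, spike_xy, edges):
    """2-D normalized rate P(sY, sD | spike) / P(sY, sD) on a shared grid."""
    p_prior, _, _ = np.histogram2d(prior_xy[:, 0], prior_xy[:, 1],
                                   bins=[edges, edges], density=True)
    p_spike, _, _ = np.histogram2d(spike_xy[:, 0], spike_xy[:, 1],
                                   bins=[edges, edges], density=True)
    ratio = np.full_like(p_prior, np.nan)
    ok = p_prior > 0
    ratio[ok] = p_spike[ok] / p_prior[ok]
    return ratio                      # ratio[i, j]: yaw bin i, distractor bin j

def yaw_section(ratio, edges, distractor_value):
    """Yaw response curve conditional on one distractor projection value."""
    j = np.searchsorted(edges, distractor_value, side="right") - 1
    return ratio[:, j]

# Toy neuron: the distractor projection shifts the yaw threshold.
rng = np.random.default_rng(4)
n = 1_000_000
s_yaw = rng.standard_normal(n)
s_dis = rng.standard_normal(n)
p_fire = 0.2 / (1.0 + np.exp(-4.0 * (s_yaw + 0.5 * s_dis - 0.5)))
fired = rng.random(n) < p_fire
edges = np.linspace(-3.25, 3.25, 14)  # 0.5-wide bins centered on -3, ..., 3
prior = np.column_stack([s_yaw, s_dis])
ratio = io_map_2d(prior, prior[fired], edges)
sections = {v: yaw_section(ratio, edges, v) for v in (-2.0, 0.0, 2.0)}
```

In this toy case the three horizontal sections are shifted copies of one another along the yaw axis; a map that was separable into a product of marginals would instead give sections that differ only by a multiplicative factor.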
First, we focus on the yaw response when the distractor projection is zero (i.e., sR,P = 0). This means that the instantaneous distractor energy is zero, but the global distractor energy is not. The input–output curves exhibit two important features. First, with increasing values of ξR,P the yaw response curves flatten (Fig. 9A,D, left). Second, to a good approximation the input–output curves all cross at a common point, sY = scross = 0.5, where the slope of the curves is maximal (Fig. 9A,D, right). Based on previous findings that the gain of H1 scales with the SD of the yaw stimulus (Brenner et al., 2000a), one might expect similar scaling to exist for a combined stimulus. To test this hypothesis, we scaled the input–output curves with the “total SD,” but allowing different weights for each stimulus. Analytically, the scale factor is written as SF =
Response curves for yaw and distractor at different yaw–distractor energies. A, D, The plots on the left show yaw input–output curves for pure yaw stimulus (brown), and for combined stimulus at zero distractor projection value (green). For illustration, four different variances ξR = 1, 3, 10, 30 and ξP = 1, 3, 5, 10 are chosen for roll and pitch, respectively. The plots on the right show the scaled version of the input–output curves, where the projection axes are scaled by SFR and SFP around the coordinates rN = 1.2, sY = 0.5, shown by the yellow circle. Here scross = 0.5, and SFR, SFP represent scale factors for roll and pitch. The fitting parameter is unchanged for pitch (αP = 0.1), but for roll (αR) it varies from 0.03 to 0.3. B, E, Yaw response conditional on three different distractor projection values (−2, 0, 2) corresponding to the three horizontal cuts on the 2-D I/O map from ê−1. C, F, Distractor response conditional on three different yaw projection values (−2, 0, 2) corresponding to the three vertical cuts on the 2-D I/O map from ê+1. The color codes blue, green, and red stand for the three projection values, and brown stands for the response to pure stimulus. The yaw, roll, and pitch projection values are denoted respectively by sY, sR, and sP. The y-axis represents the normalized firing rate (rN). The light-colored bands around the solid mean lines represent the SD calculated over a range of 0.2 around the conditional values −2, 0, 2.
We can also look at the yaw input–output curves that are conditional on nonzero values of distractor projection. Representative examples are shown in Figure 9, B and E (dominant-negative eigenvalue). Here the red, green, and blue curves are horizontal slices through the 2-D input–output maps at projection values of −2, 0, and 2 SDs of the distractor. These conditions measure instantaneous projections on the distractor modes. Compared with the yaw response curves that are conditional on zero distractor projection (Fig. 9A,D), the response curves for nonzero distractor projections become more flattened and shift along the horizontal axis (Fig. 9B,E). The direction of shift is opposite for the opposite sign of the projections. An earlier result that shows that H1 is weakly sensitive to mean distractor velocity (Fig. 5A,B) provides a clue to this behavior. A rightward or leftward shift in the threshold indicates the presence of large instantaneous distractor energy associated with motion components that primarily compete or act in unison with excitatory yaw components, such that in one case higher than normal yaw excitation is required to elicit a response, and vice versa for the other case. The flattening occurs due to an overall reduction in the sensitivity to yaw in the presence of strong instantaneous distractor energy, thus diminishing the gain to yaw.
Finally, Figure 9, C and F, shows distractor input–output curves conditional on different yaw projections for the eigenvector ê+1, as well as the pure distractor input–output behavior. One interesting feature for the roll input–output curve is that it has an asymmetric “U” shape, which means that spikes are generated preferentially for large negative or large positive projections on the distractor half of ê+1. This behavior is related to a broadening of the posterior distribution compared with the prior, resulting in an increased variance, which is also related to the positive sign of the eigenvalue. The asymmetry in the U shape of the roll response curves suggests unequal excitation from CCW and CW roll. A small signature of the U shape is also found in the pitch response curves, and we shall return to a possible explanation for this behavior in the Discussion. The vertical shift of the input–output curves here may be expected simply from the overall effect of yaw on spiking probability.
Relevant subspace information
We begin with the caveat that eigenvalues provide an incomplete description of the posterior distribution because they fail to capture statistical features beyond the second order. Stimulus-specific information per spike (Eq. 18) is a more appropriate choice because it is independent of assumptions about the statistics of the distribution and allows us to probe the existence of a synergistic relationship (Brenner et al., 2000b; Schneidman et al., 2003) between yaw and distractor features. Data limitations make it difficult to sample a high-dimensional stimulus space (Rieke et al., 1997; Dayan and Abbott, 2001), so we limit the analysis to a 2-D subspace. The leading stimulus dimensions corresponding to ê−1 and ê+1 are used for the analysis.
Figure 10A shows that spike information about yaw decreases asymptotically from 0.53 bits to zero as ξR increases, while spike information about roll and pitch increases from zero to 0.21 and 0.18 bits, respectively (Fig. 10B). Thus, maximum roll and pitch information are respectively 40% and 35% of the maximum yaw information. Since the curves do not plateau, it is conceivable that saturation occurs at yet larger distractor variance. Overall, this characterizes a transition from a yaw-encoding regime to a distractor-encoding regime. Although yaw dominates ê−1, nonzero distractor information indicates that when distractor variance is high, distractor motion overrides yaw motion in driving spike generation. In contrast, spikes carry negative information, up to −0.22 bits about roll and up to −0.14 bits about pitch, corresponding to the ê+1 dimension (Fig. 10C,D). Note that these measures correspond to the difference between posterior and prior entropies, also termed stimulus-specific information (see DeWeese and Meister, 1999; see Materials and Methods). Recalling the shape of nonlinearity (Fig. 8B,D, left), the negative sign results from higher entropy of a widened posterior distribution. We found no evidence of negative yaw information carried by spikes. The mutual information (Eq. 21) is necessarily non-negative, and is 0.034 and 0.017 bits for ê−1Y and ê−1R, respectively. We also found that the information estimate from the 2-D stimulus subspace spanned by ê−1Y and ê−1R, or ê−1Y and ê−1P, was equal to the sum of information estimates from the individual subspaces, within statistical error. This rules out a synergistic effect between yaw and distractor features.
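The sign behavior described above can be illustrated with a short sketch. It assumes that the per-spike measure is the prior entropy minus the posterior entropy, as described in the text, and contrasts it with the Kullback-Leibler divergence between posterior and prior, which cannot be negative. The function name `spike_information` and the Gaussian test distributions are hypothetical.

```python
import numpy as np

def spike_information(p_prior, p_post):
    """Two per-spike measures from binned prior P(s) and posterior P(s|spike).

    Returns (entropy_difference, kl_divergence) in bits. The first,
    H[prior] - H[posterior], can be negative; the second cannot.
    """
    p_prior = np.asarray(p_prior, dtype=float) / np.sum(p_prior)
    p_post = np.asarray(p_post, dtype=float) / np.sum(p_post)

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    both = (p_post > 0) & (p_prior > 0)
    kl = np.sum(p_post[both] * np.log2(p_post[both] / p_prior[both]))
    return entropy(p_prior) - entropy(p_post), kl

def gauss(s, sd):
    """Unnormalized zero-mean Gaussian evaluated on the grid s."""
    return np.exp(-0.5 * (s / sd) ** 2)

# A posterior broader than the prior (as for a positive eigenvalue) versus
# a posterior narrower than the prior (as for a negative eigenvalue).
s = np.linspace(-8, 8, 801)
wide_dH, wide_kl = spike_information(gauss(s, 1.0), gauss(s, 1.3))
narrow_dH, narrow_kl = spike_information(gauss(s, 1.0), gauss(s, 0.6))
```

For Gaussians on a common grid the entropy difference is close to −log2 of the ratio of posterior to prior SD, so a posterior that is 30% wider than the prior gives a negative value, while both Kullback-Leibler values remain positive.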
Spike information from the leading dimensions of the yaw–distractor subspace. The two leading eigenvectors of ΔC matrix, ê+1 and ê−1, are split into respective yaw and distractor halves. Each of the halves is used to construct the marginal and joint probability distributions for estimating information (Eq. 18). A–D, Information (in bits per spike) as a function of ξR/P corresponding to the eigenvectors ê+1 and ê−1 is shown for the yaw–roll combined stimulus (A, C) and the yaw–pitch combined stimulus (B, D). The three curves distinguished by the three marker types correspond to the 1-D yaw subspace (filled triangle), the 1-D distractor subspace (filled square), and the 2-D yaw–distractor subspace (empty circle). Missing values in C and D are due to the absence of λ > 0 at the corresponding values of ξR and ξP. Error bars represent the SEM (100 bootstrap samples).
A comparison of information estimates
Stimulus information is conveyed both by individual spikes and by spike patterns (de Ruyter van Steveninck and Bialek, 1988; Schneidman et al., 2011). The excess information that spike patterns carry over single spikes (Brenner et al., 2000b) can be obtained by comparing the estimates from Equations 3 and 4 (see Materials and Methods). For example, the spike train conveys 1.8 bits per spike, while single spikes convey 1.25 bits about the repeated yaw stimulus at ξR = 1. Thus, single spikes account for 70% of the spike train information at ξR = 1, which eventually decreases to 47% at large roll variance (Fig. 11A). For pitch, the decrease is from 79% to 45% (Fig. 11B). A decrease in spike-timing precision to repeated yaw, caused by a distractor competing with yaw for available spikes, may play a dominant role here. The overall dependence of information rate on ξ is approximately logarithmic (Fig. 11A,B).
Spike train information compared with stimulus subspace information, at different distractor strengths. A, B, Spike train information (RInfo:yaw) color coded red and single-spike information (ISpk; Eq. 4) color coded brown shown for the combined stimulus of repeated yaw and nonrepeated distractor. Information per spike (Eq. 18) from the yaw and roll modes of ê−1 are shown color coded green and blue respectively. C, D, Spike train information (RInfo:roll, RInfo:pitch) color coded red and single-spike information (ISpk; Eq. 4) color coded brown are shown for the combined stimulus of repeated roll or pitch with nonrepeated yaw. The x-axis shows the range of ξR and ξP on a logarithmic scale.
We can assess the completeness of the feature description by comparing the information associated with features (Eq. 18) with single-spike information (Eq. 4). However, such a comparison requires caution because specific distractor features contribute negative symbol information, unlike the estimate from Equation 4, which is always positive. Therefore, for comparison, we choose only those features that contribute positive information. The leading yaw dimension (ê−1Y) of the yaw–roll and yaw–pitch subspaces contributes a maximum of 42% (0.52 bits) and 36% (0.45 bits), respectively, of the single-spike yaw information (Fig. 11A,B). The joint yaw space constructed from the leading yaw dimensions ê−1Y and ê−2Y contributes up to 53% of single-spike yaw information. The information we obtained from other relevant dimensions, such as ê−2 and ê−3, was lower and decreased with decreasing significance of the eigenvalues.
Since distractor features contribute to spike generation (Fig. 10), it is conceivable that spike patterns carry significant distractor information. We test this by using a presentation of repeated distractor with nonrepeated yaw across trials (Fig. 1C,D, segment 6). Figure 11C shows that spike train information rises to 1 bit/spike, while single-spike information rises to 0.8 bits for roll. For pitch, the spike train information rises to 0.63 bits/spike, and single-spike information rises to 0.60 bits/spike (Fig. 11D). As a comparison, the joint feature space of ê−1R and ê−2R contributes 0.35 bits, and the joint feature space of ê−1P and ê−2P contributes 0.37 bits (Fig. 11A,B). Even if we add the information associated with each of the relevant dimensions, the sum is still less than what we obtain for single spikes from the spike train (data not shown). This indicates that the spike train carries information about stimulus features beyond those identified from the relevant subspace, and it is possible that those features are associated with particular spike combinations.
Discussion
In this article, we provide a quantitative analysis of preferred motion coding from stimuli not aligned with the preferred direction of the neuron. The presence of a distractor reduces the efficiency with which H1 codes yaw, but increases the efficiency of coding the distractor or, specifically, the features of the distractor that excite H1. Geometry of the spatial pattern, disparity in the strength of local excitation and suppression, and blurring are some of the factors that may impact the efficiency of coding preferred motion. Our results provide insight into how the response is shaped, not only by the stimulus statistics, but also by the complexity of the optic flow. The method used here can be generalized to studies in other systems, including, but not limited to, the coding of complex, multidimensional stimuli.
It has been shown that H1 can operate with a reliability close to the statistical limits set by noise in the visual input (de Ruyter van Steveninck and Bialek, 1995). Other, internal, noise sources (Schneidman et al., 1998; Manwani and Koch, 1999; White et al., 2000) may also limit the information communicated about the stimulus. How large is the distractor noise in comparison with noise from these other sources? We found that distractor motion acts as the primary source of noise for yaw (“signal”) encoding, dominating noise from other sources (Fig. 3). If this were not the case, then we would have observed a minimal decrease in spike train information with an increase in distractor variance. The most direct effect of this distractor noise is a decrease in information rate, which is caused by a reduction in firing precision to repeated yaw (Bialek et al., 1991) at moderate distractor variances, and a reduction in both firing precision and firing rate at very large distractor variances (Optican and Richmond, 1987; Bialek et al., 1991; Victor and Purpura, 1996; Warland et al., 1997).
There are two possible explanations for the above findings. First, at low-to-moderate distractor variance, local variation of yaw components is induced by the distractor due to a combination of the aperture effect (Marr and Ullman, 1981) and varying local flow directions, the latter in the case of roll. The resulting ambiguity between components of pure yaw and those induced by the distractor cannot be resolved by a local measurement. Rather, the percept of a wide-field motion is built out of many such local ambiguous measurements. This effectively induces noise, which becomes stronger as distractor variance increases. Second, at very high distractor variance, rapid fluctuations from the distractor cause blurring, leading to diminished image contrast. Since the incoming visual signal is low-pass filtered by photoreceptors (Howard et al., 1987), only low spatial frequencies survive at those high-velocity amplitudes. With a typical spatial correlation length of 6° (Fig. 1A) and a pitch SD of 1000°/s, the correlation time is 6°/(1000°/s) = 6 ms, which is of the same order as the photoreceptor integration time (Laughlin and Weckstrom, 1993). The decrease in RTot may be related to these effects.
Previous work suggests that optic flow from natural flight (Kern et al., 2001; Boeddeker et al., 2005; van Hateren et al., 2005; Karmeier et al., 2006) and higher-order motion stimuli (Quenzer and Zanker, 1991; Lee and Nordström, 2012) contain motion components that can evoke a neural response. A strong enough motion along a nonoptimal axis can also induce neural excitation (Karmeier et al., 2003). Our results provide evidence that the structure of the optic flow field shaped by the combined motion plays a crucial role in preferred motion encoding. Since roll generates local motion, with yaw and pitch components speeding up as the distance increases from the center of rotation, it induces both excitation and suppression in H1. But a net positive roll STA (Fig. 5A) indicates that excitation is stronger in the ventral half than in the dorsal half, corroborating earlier observations (Eckert, 1980; Hausen, 1981). This means that effective excitation from counterclockwise roll (ventral excitation with dorsal inhibition) overrides effective inhibition from clockwise roll (dorsal excitation with ventral inhibition). Excitation and inhibition could also have different levels of saturation, depending on the stimulus strength (Borst and Egelhaaf, 1990), such that with a strong enough clockwise roll, inhibition in the ventral half cannot overcome excitation in the dorsal half. This may qualitatively describe the asymmetric U shape of the roll response curves (Fig. 9C).
The geometry and size of the spatial pattern have also been reported to impact motion response, both in insects (Borst et al., 1993; Meyer et al., 2011; O'Carroll et al., 2011; Saleem et al., 2012) and in vertebrates (Rodman and Albright, 1989). The response of H1 to pitch motion in our case can perhaps be attributed to two factors. First, the apparent horizontal motion induced by vertical movement of slanted pattern edges (i.e., the aperture effect; Marr and Ullman, 1981) can induce excitation of H1, and an excitation/inhibition asymmetry, as noted above, may result in residual net effects. Second, excitatory input from ipsilateral VS1 to H1 also can increase the response sensitivity to downward pitch (Haag and Borst, 2003).
What motion features does H1 encode? The eigenvector profiles demonstrate that H1 preferentially encodes stimulus features such as velocity, acceleration, and jerk, among others (Fig. 7), which was also reported in the study by Brenner et al. (2000a). These types of features are also coded by motion-sensitive neurons in pigeons (Cao et al., 2004) and parieto-insular vestibular cortex neurons in macaques (Chen et al., 2010). Our results further establish that these features carry a large portion of stimulus information. For example, the 2-D subspace spanned by ê−1Y and ê−2Y contributes 40% of the spike train yaw information. This raises the question of whether natural stimuli are better alternatives to Gaussian stimuli. Studies on fly motion-sensitive neurons under natural conditions (Lewen et al., 2001) and using natural stimuli (Karmeier et al., 2006) suggest that neurons encode features beyond velocity. Further, natural stimuli have been linked to a higher coding efficiency in auditory systems (Rieke et al., 1995; Escabí et al., 2003) and visual systems (Vinje and Gallant, 2002; Felsen et al., 2005), indicating the presence of a richer repertoire of features in such stimuli (Dong and Atick, 1995). However, the statistical complexity of natural stimuli and the strong response nonlinearities make the analyses nontrivial (Egelhaaf and Borst, 1989; van Hateren, 1997; O'Carroll et al., 2011). Nonparametric methods such as Maximally Informative Dimensions (Sharpee et al., 2004) and its generalizations (Rajan and Bialek, 2013) may help identify which features are relevant and how informative those features are.
It is well known that the visual system adapts to stimulus properties, such as contrast and stimulus variance (Maddess and Laughlin, 1985; Smirnakis et al., 1997; Brenner et al., 2000a; Harris et al., 2000; Fairhall et al., 2001; Baccus and Meister, 2002). This has been linked to efficient signal coding (Barlow, 1961) and an increase in neural information throughput (Laughlin, 1981; Brenner et al., 2000a; Safran et al., 2007). We find that H1 adapts its response to yaw in a way that depends on the global distractor variance even when the instantaneous distractor energy is zero. The collapse of scaled yaw response curves onto a single curve (Fig. 9A,D) may indicate a universal strategy for scaling. In contrast, a change in the fitting parameter for roll (αR) suggests that scaling depends crucially on the structure of the optic flow, which determines the effective wide-field excitation, and not on the stimulus variance alone. Overall, these empirical findings provide insight into how a system might adapt to the combined dynamics of different stimulus components. Whether, and to what extent, this can be understood in a framework of optimal coding (Laughlin, 1981) remains an open question. Since different directions of motion are relevant here, a complete answer would require an analysis of coding across multiple neurons, which would necessitate experiments that simultaneously record from motion-sensitive cells with different directional selectivities (see also van Hateren, 1990).
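The notion of response curves collapsing under rescaling can be made concrete with a small simulation. The sketch below assumes, purely for illustration, that rescaling means dividing the stimulus axis by the stimulus standard deviation; the exact scaling used for Fig. 9 is not restated in this section. The model neuron normalizes its input by construction, so the collapse it shows illustrates the analysis and is not a finding. All names and parameter values are hypothetical.

```python
import numpy as np


def response_curve(projection, spikes, bin_edges):
    """Estimate P(spike | projection) within each bin (NaN if a bin is empty)."""
    which = np.digitize(projection, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    valid = (which >= 0) & (which < n_bins)
    total = np.bincount(which[valid], minlength=n_bins)
    fired = np.bincount(which[valid], weights=spikes[valid], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return fired / total


def simulate_adapting_neuron(sigma, n_samples, rng):
    """Hypothetical neuron whose gain is normalized by the stimulus SD."""
    s = rng.normal(0.0, sigma, n_samples)
    # Sigmoidal spike probability in units of s / sigma (assumed form).
    p_spike = 1.0 / (1.0 + np.exp(-(s / sigma - 1.0) / 0.3))
    spikes = (rng.random(n_samples) < p_spike).astype(float)
    return s, spikes


def demo(sigmas=(1.0, 4.0), n_samples=200_000, seed=0):
    """Compare response curves on raw and on SD-normalized stimulus axes."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(-2.0, 2.0, 9)
    raw, scaled = [], []
    for sigma in sigmas:
        s, spikes = simulate_adapting_neuron(sigma, n_samples, rng)
        raw.append(response_curve(s, spikes, edges))
        scaled.append(response_curve(s / sigma, spikes, edges))
    # Largest difference between the two curves on each axis.
    raw_gap = np.nanmax(np.abs(raw[0] - raw[1]))
    scaled_gap = np.nanmax(np.abs(scaled[0] - scaled[1]))
    return raw_gap, scaled_gap
```

On the raw stimulus axis the two curves differ substantially, whereas on the normalized axis they coincide up to sampling noise. A departure from such a collapse in real data, like the change in the roll fitting parameter noted above, would indicate that variance alone does not set the scaling.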
In summary, we have characterized the features encoded by H1 from a stimulus with competing motion components and have shown how such features shape the neural response. Although scaling of the output to the collective dynamic range of the input might be viewed as a strategy for optimizing information transmission about the preferred motion buried in a complex stimulus, further studies that record from multiple neurons with different directional selectivities are needed to validate this argument. The Gaussian white noise stimulus imposes an obvious limitation on our study, because it lacks the statistical variability and functional relevance of a natural stimulus. Its merit, however, lies in its mathematical tractability (Marmarelis and Marmarelis, 1978), which allows complete characterization of the stimulus feature space. A natural next step is to incorporate stimuli with higher-order correlations and, ultimately, natural stimuli into this information-theoretic framework. Modeling (Prenger et al., 2004; Butts et al., 2011) and nonparametric approaches (Sharpee et al., 2004) will prove particularly useful in such studies for characterizing the physiologically relevant features and their neural representation.
Footnotes
We thank Anne C. Mennen for participating in experiments [supported by National Science Foundation Grant 1156540 for the Research Experiences for Undergraduates (REU) Program]; and Philip L. Childress for technical support.
The authors declare no competing financial interests.
Correspondence should be addressed to Rob de Ruyter van Steveninck, Department of Physics, Indiana University Bloomington, Bloomington, IN 47405. deruyter{at}indiana.edu