The Journal of Neuroscience, August 18, 2004, 24(33):7305-7323; doi:10.1523/JNEUROSCI.0554-04.2004
Behavioral/Systems/Cognitive
Adaptive Temporal Integration of Motion in Direction-Selective Neurons in Macaque Visual Cortex
Wyeth Bair¹,² and J. Anthony Movshon¹
¹Center for Neural Science, New York University, New York, New York 10003, and ²University Laboratory of Physiology, Oxford OX1 3PT, United Kingdom
Abstract
Direction-selective neurons in the primary visual cortex (V1) and the extrastriate motion area MT/V5 constitute a critical channel that links early cortical mechanisms of spatiotemporal integration to downstream signals that underlie motion perception. We studied how temporal integration in direction-selective cells depends on speed, spatial frequency (SF), and contrast using randomly moving sinusoidal gratings and spike-triggered average (STA) analysis. The window of temporal integration revealed by the STAs varied substantially with stimulus parameters, extending farther back in time for slow motion, high SF, and low contrast. At low speeds and high SF, STA peaks were larger, indicating that a single spike often conveyed more information about the stimulus under conditions in which the mean firing rate was very low. The observed trends were similar in V1 and MT and offer a physiological correlate for a large body of psychophysical data on temporal integration. We applied the same visual stimuli to a model of motion detection based on oriented linear filters (a motion energy model) that incorporated an integrate-and-fire mechanism and found that it did not account for the neuronal data. Our results show that cortical motion processing in V1 and in MT is highly nonlinear and stimulus dependent. They cast considerable doubt on the ability of simple oriented filter models to account for the output of direction-selective neurons in a general manner. Finally, they suggest that spike rate tuning functions may miss important aspects of the neural coding of motion for stimulus conditions that evoke low firing rates.
Key words: macaque monkey; primary visual cortex; area MT; area V5; visual motion; direction selectivity; temporal integration; white noise; reverse correlation; spike-triggered average; spatial frequency; temporal frequency; contrast; information theory; integrate-and-fire model
Introduction
Motivated by the idea that the visual cortex is a spatial frequency (SF) and temporal frequency (TF) analyzer, the responses of direction-selective (DS) neurons are commonly modeled using linear filters that are oriented in space-time (Fahle and Poggio, 1981; Watson and Ahumada, 1983; van Santen and Sperling, 1984; Adelson and Bergen, 1985). These models have gained wide use in physiologically inspired computer simulations of motion perception (Heeger, 1987; Grzywacz and Yuille, 1990; Nowlan and Sejnowski, 1995; Simoncelli and Heeger, 1998) and have received additional support from experimental studies (Reid et al., 1991; Emerson et al., 1992; Emerson, 1997; De Valois et al., 2000; Touryan et al., 2002). If the response of a DS neuron can be described effectively by such simple combinations of spatiotemporal filters, then the envelope of the filter, essentially the receptive field (RF) profile, should be stable for a given cell and easily mapped in space and time (Touryan et al., 2002).
However, psychophysical studies show that the temporal profile of motion integration is not stable but varies with stimulus speed, SF, and contrast (Nachmias, 1967; Vassilev and Mitov, 1976; Breitmeyer and Ganz, 1977; Thompson, 1982; Van Doorn and Koenderink, 1982; De Bruyn and Orban, 1988; Müller and Greenlee, 1998; Burr and Corsale, 2001; Vassilev et al., 2002). Could these stimulus-related changes at the perceptual level originate from changes in the properties of single cortical DS cells, or do they simply reflect a population of diverse, but individually fixed, temporal RF profiles? Fixed RFs are consistent with demonstrations that linear spatiotemporal filters account well for response properties including direction selectivity in V1 simple cells (Movshon et al., 1978; Reid et al., 1987; McLean and Palmer, 1989, 1994; DeAngelis et al., 1993), but there have been some reports of stimulus-related changes in temporal integration in simple and complex cells in cats and monkeys (Dean et al., 1982; Reid et al., 1992).
To determine whether the temporal RFs of DS cells are fixed, we presented randomly moving stimuli, essentially coarse approximations to white noise in the velocity domain (de Ruyter van Steveninck and Bialek, 1988; Bair et al., 1997; Buračas et al., 1998; Borghuis et al., 2003), and computed spike-triggered averages (STAs) (de Boer and Kuyper, 1968) to estimate first-order profiles of temporal integration across multiple stimulus conditions. We tested complex DS cells in V1 because V1 is where direction selectivity originates (Hubel and Wiesel, 1962) and these cells have been closely compared with sets of motion filters. We also tested DS cells in MT/V5 (Zeki, 1974), which have much larger RFs (Gattass and Gross, 1981; Albright and Desimone, 1987) and can be selective for global pattern motion (Movshon et al., 1985). This allowed us to compare responses in V1 with those at a higher level in which DS responses have been closely linked to motion perception (for review, see Parker and Newsome, 1998). We did not examine DS simple cells because the spatial phase dependence of their responses calls for a more elaborate stimulus paradigm and makes them less directly comparable with MT cells. In both V1 and MT, we found that the temporal profiles reflected by the STAs changed substantially with the spatiotemporal structure and contrast of the stimuli. We also presented our visual stimuli to a model DS unit consisting of a set of motion energy (ME) filters (Adelson and Bergen, 1985) and an integrate-and-fire (IF) spiking mechanism. The model did not capture the changes in the STAs observed in the vast majority of our DS neurons. Our results strongly suggest that DS responses in V1 and MT cannot be accounted for by standard models with fixed profiles of temporal integration. Rather, the responses reflect a system that changes its integration properties with stimulus parameters in a manner consistent with psychophysical observations.
Materials and Methods
Electrophysiology
We recorded extracellularly from the primary visual cortex of anesthetized, paralyzed macaque monkeys (two Macaca nemestrina and eight Macaca fascicularis). Detailed methods for this type of recording were described by Cavanaugh et al. (2002). Experiments typically lasted from 4 to 5 d, during which anesthesia and paralysis were maintained with sufentanil citrate (4-6 µg/kg/hr) and vecuronium bromide (Norcuron; 0.1 mg/kg/hr), respectively, administered in lactated Ringer's solution (8 ml/kg/hr) containing dextrose (2.5%). Artificial respiration with moist room air was maintained with rate adjustments to keep expired CO2 between 3.8 and 4.0%. Body temperature was maintained near 37°C with a heating pad. EEGs and electrocardiograms were monitored to ensure proper depth of anesthesia. All procedures conformed to guidelines of the New York University Animal Welfare Committee.
Tungsten-in-glass microelectrodes (Merrill and Ainsworth, 1972) were advanced with a hydraulic microdrive downward through a craniotomy of 9-10 mm diameter. In some experiments, we used a mechanical microdrive system with quartz-platinum/tungsten microelectrodes (Thomas Recordings, Marburg, Germany). For V1, the craniotomy was typically centered 4 mm posterior to the lunate sulcus and 10 mm lateral to the midline. Neurons in V1 were recorded both on the operculum and in the calcarine sulcus (typical RF eccentricities were 1-6° and 8-20°, respectively). For MT, the craniotomy was centered 15 mm lateral to the midline, 4 mm posterior to the lunate sulcus, and the angle of advance was 20° down (relative to horizontal) and forward in the parasagittal plane. Action potentials were detected using a hardware dual-window time-amplitude discriminator (Bak, Mount Airy, MD) and time stamped at a resolution of 0.25 msec. Electrolytic lesions (2 µA for 2-5 sec) were made for histological verification and estimation of cortical layer. At the end of experiments, animals were given an overdose of sodium pentobarbital (30-60 mg/kg), exsanguinated through the heart, and perfused with 4% paraformaldehyde in saline.
Visual stimuli
Visual stimuli were generated by custom software on a CRS 2/2 Board (Cambridge Research Systems, Kent, UK) and presented on a standard cathode ray tube (CRT) at 100 Hz vertical refresh with a mean luminance of 33 cd/m². The CRT was placed farther from the monkey's eye for smaller neuronal RFs and closer for larger RFs. The distance ranged from 80 to 180 cm, for which the screen covered approximately 10-20° of visual angle. Stimuli were presented on a mean gray background and, except where noted, at 100% Michelson contrast (100% nominal contrast is approximately 98% actual contrast because the minimum luminance on our CRT was approximately 0.5 cd/m²).
Basic characterization with drifting sine waves. We mapped the RF for each cell by hand with patches of drifting sinusoidal gratings to estimate values of four parameters (orientation, SF, TF, and patch size) of the grating that maximized the firing rate of the cell. We then used a small patch of optimal grating to determine the RF center. After hand mapping, we ran four computer-controlled experiments to systematically and sequentially optimize the four stimulus parameters in the order listed above. In each experiment, trials were interleaved in a blockwise random manner, and grating stimuli were presented in a circular aperture for 2-4 sec with 2 sec of mean gray between trials. Direction of motion, which was always perpendicular to orientation, was tested at 22.5° increments. We will refer to the direction eliciting the largest response as the preferred direction and that 180° opposite as antipreferred. SF was tested at half-octave steps over a five-octave range that was approximately centered on the optimal SF. TF was tested in octave increments from 0.2 to 25 Hz. Finally, the diameter of the grating patch was tested over a five-octave range. We defined the classical RF size to be the smallest diameter that produced at least 95% of the maximum response (for details, see Cavanaugh et al., 2002).
We classified cells in V1 as simple or complex on the basis of their modulation index, MI = F1/DC, in response to an optimal drifting grating (Skottun et al., 1991). Here, DC is the mean evoked firing rate (in excess of the spontaneous firing rate to a mean gray field), and F1 is the amplitude of the Fourier component of the response at the stimulus TF. We will refer to V1 cells with MI < 1 as complex. For all cells, we computed a direction index, DI = 1 - a/p, where p and a are the evoked firing rates for the preferred and antipreferred directions of motion (Maunsell and Van Essen, 1983). If a cell fired equally for both directions, then DI = 0. Cells that were strongly direction selective had a DI near 1. Values of DI > 1 indicate that the antipreferred stimulus suppressed the firing rate below the spontaneous rate. All cells studied here had DI > 0.7 (two of the V1 cells had DI < 0.8) and will be referred to as direction selective.
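The two indices defined above reduce to simple ratios of firing rates. The following is a minimal sketch, not the authors' analysis code; the function names and the example rates are hypothetical and chosen only to show how DI can exceed 1 when the antipreferred response falls below the spontaneous rate.

```python
def modulation_index(f1, dc):
    """MI = F1/DC, where DC is the evoked mean rate (spontaneous rate removed)."""
    return f1 / dc


def direction_index(pref_rate, anti_rate, spont_rate=0.0):
    """DI = 1 - a/p, with p and a the evoked (spontaneous-subtracted) rates."""
    p = pref_rate - spont_rate
    a = anti_rate - spont_rate
    return 1.0 - a / p


# Hypothetical rates in spikes/sec, for illustration only.
print(direction_index(40.0, 10.0, spont_rate=2.0))  # antipreferred response above spontaneous
print(direction_index(40.0, 1.0, spont_rate=2.0))   # suppressed below spontaneous, so DI > 1
```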
Random motion stimuli. After the initial characterization, we tested each DS cell with dynamic stimuli in which an optimally oriented sinusoidal grating moved randomly back and forth along the axis of preferred motion of the cell. Specifically, the spatial phase of the grating was shifted between successive video frames (every 10 msec) by either +Δφ or -Δφ, where Δφ was fixed and ≤1/4 spatial cycle of the grating. A shift of +Δφ generated motion in the preferred direction, whereas a shift of -Δφ generated an antipreferred motion. Figure 1A shows a sequence of four stimulus frames (numbered 1-4) in which the grating moves in the antipreferred direction (downward) by 1/4 cycle between the first and second frames (downward arrow) and then moves in the preferred direction (upward) on the next two frames (upward arrows). The stimulus performed a binary random walk along the axis of preferred motion, and the movements were governed by a pseudorandom sequence generated either from the ran2 algorithm of Press et al. (1992) or from a binary m-sequence (Sutter, 1987; Reid et al., 1997).
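The binary random walk of grating phase can be sketched in a few lines. This is an illustration under stated assumptions: Python's built-in generator stands in for ran2 or an m-sequence, and the function name, its parameters, and the choice to express phase in cycles are hypothetical.

```python
import random

FRAME_MS = 10  # one video frame at the 100 Hz refresh rate


def random_walk_phase(n_frames, step_cycles=0.25, seed=0):
    """Return (steps, phases).

    steps  : +1 (preferred) or -1 (antipreferred) for each frame
    phases : grating phase in cycles after each frame, wrapped to [0, 1)
    """
    rng = random.Random(seed)  # stand-in for ran2 or a binary m-sequence
    steps = [rng.choice((1, -1)) for _ in range(n_frames)]
    phases = []
    phase = 0.0
    for s in steps:
        phase = (phase + s * step_cycles) % 1.0
        phases.append(phase)
    return steps, phases


steps, phases = random_walk_phase(8)
print(steps)
print(phases)
```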

Figure 1. The random motion stimulus and the computation of the STA. A, A sequence of four frames of the visual stimulus in which the optimal grating is shifted by 90° of phase in either the preferred (upward; up arrows) or antipreferred (downward; down arrow) direction between frames. B, The impulse representation of a 500 msec segment of stimulus is shown. The amplitude and sign of the impulses represent the size and direction of the motion (the displacement of the grating) between frames. C, The boxcar representation of the same stimulus segment takes the value 1 or -1 if the most recent movement was preferred or antipreferred, respectively. It can be constructed by convolving the function in B with a 10 msec wide boxcar. D, STAs were computed for spike trains from a model with a square window of temporal integration of duration 20 msec. The STA (thick line) computed from the boxcar stimulus was smoother than that (thin line) computed from the impulse stimulus. E, STAs computed for neuronal data. The STA (thick line) computed from the boxcar stimulus is smoother and virtually no different in shape than that (thin line) computed from the impulse stimulus.
Rather than quantify the speed of the random motion in terms of Δφ (the phase shift per video frame), we define a more convenient value called equivalent temporal frequency (ETF), which is the change in phase, Δφ, divided by the change in time, 10 msec. The ETF is the TF of a grating that drifts in one direction with a phase shift of Δφ on each video frame. Our fastest random motion stimulus had ETF = 25 Hz (1/4 cycle per 10 msec).
The advantage of using the random motion stimulus is that temporal integration can be mapped with stimuli having a variety of spatial and temporal structure, allowing us to determine how the operation of the visual system changes when it is confronted with different visual contexts. We will examine results from experiments in which one of three parameters varied: the ETF of the motion, the SF of the grating, or the contrast of the grating. In each experiment, the random motion stimuli were presented in trials of duration 20-40 sec and separated by 2 sec of mean gray. Motion on each trial was governed by a different random sequence.
Data analysis
Representation of stimulus and response. We represented the spike trains and the visual stimuli as discrete functions of time at a 1 msec resolution. The spike trains were 1 when a spike occurred and 0 everywhere else. We defined two representations of the random motion stimulus. The impulse representation uses positive and negative impulses to represent the displacements of the grating between frames (Fig. 1 B). The boxcar representation is generated by convolving the impulse representation with a 10 msec wide boxcar function, effectively replacing each impulse by a 10 msec boxcar centered on the impulse (Fig. 1C). The boxcar representation has only two values, a positive value for preferred motion and a negative value for antipreferred motion. When these values are defined to be 1 and -1, the stimulus is normalized (it has mean of 0 and variance of 1) and it may be thought of as a normalized velocity signal.
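The two stimulus representations can be built directly from the frame-by-frame sequence of movements. The sketch below assumes NumPy and a hypothetical helper name. It holds each value for the 10 msec that follow a movement, which matches the "most recent movement" description in Figure 1C; a boxcar centered on the impulse would simply be shifted by 5 msec.

```python
import numpy as np


def impulse_and_boxcar(steps, frame_ms=10):
    """Build 1 ms resolution stimulus representations from per-frame steps (+1/-1)."""
    steps = np.asarray(steps, dtype=float)
    n = steps.size * frame_ms
    impulse = np.zeros(n)
    impulse[::frame_ms] = steps           # one impulse at each frame onset
    boxcar = np.repeat(steps, frame_ms)   # hold the sign of each movement for 10 ms
    return impulse, boxcar


impulse, boxcar = impulse_and_boxcar([1, -1, -1, 1])
# The boxcar form equals the impulse form convolved with a 10 ms boxcar.
same = np.allclose(np.convolve(impulse, np.ones(10))[:boxcar.size], boxcar)
print(same, np.mean(boxcar ** 2))
```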
Computation of the STA. We used the method of spike-triggered averaging (de Boer and Kuyper, 1968) to quantify the relationship between the spike train and the motion in the random stimulus. The STA was computed by averaging together fixed-length segments of the stimulus that preceded each spike. Each stimulus segment was aligned to the time of the spike, defined as t = 0. This is equivalent to computing the cross-correlation function for the spike train and the stimulus and examining only the half for which the stimulus precedes the spikes. With the normalized boxcar representation of the stimulus, the STA ranged from -1 to 1.
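Spike-triggered averaging as described here amounts to collecting the stimulus segment before every spike and taking the mean. A minimal sketch, assuming NumPy and 1 msec bins; the function name, the default segment length, and the decision to skip spikes that lack a full preceding segment are assumptions made for illustration.

```python
import numpy as np


def spike_triggered_average(stimulus, spikes, n_lags=200):
    """Average the n_lags ms of stimulus preceding each spike.

    stimulus : 1-D array at 1 ms resolution (e.g., the boxcar representation)
    spikes   : 1-D array of 0/1 at the same resolution
    Returns (lags, sta), with lags running from -n_lags to -1 ms relative to the spike.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    spike_times = np.flatnonzero(spikes)
    spike_times = spike_times[spike_times >= n_lags]  # need a full segment
    if spike_times.size == 0:
        raise ValueError("no spikes with a complete preceding stimulus segment")
    segments = np.stack([stimulus[t - n_lags:t] for t in spike_times])
    return np.arange(-n_lags, 0), segments.mean(axis=0)
```

With a stimulus normalized to values of 1 and -1, the returned average is bounded by -1 and 1, as stated in the text.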
We used the STA to estimate the temporal profile of motion integration; therefore, it is important to consider how the choice of stimulus representation impacts the shape of the STA. Using the boxcar stimulus essentially convolves the STA with a boxcar function. For example, consider a system that sums the motion stimulus within a rectangular window of width 20 msec and fires spikes at random at a rate proportional to this sum. This system has a known window of integration: a 20 msec wide rectangle. The STA computed with the impulse stimulus accurately reflects the rectangular structure of the integration window (Fig. 1 D, thin line), whereas the STA computed from the boxcar stimulus (thick line) has sloped ends, reflecting the boxcar convolution. However, the latter STA is smoother and its basic features (e.g., its height and width) are very similar to those of the STA computed from the impulse stimulus. When a more realistic, rounded window is used, there is almost no difference in shape between the STAs computed with the impulse and boxcar stimulus representations. This is demonstrated in Figure 1 E for neuronal data. The STA computed with the boxcar stimulus (thick line) is a smoother version of the STA computed with the impulse stimulus (thin line).
In summary, we will use the boxcar representation to compute STAs because of the smoothing that it offers, but with the understanding that it cannot resolve features at a temporal resolution below 10 msec. Furthermore, this representation allows us to interpret the vertical axis of the STA as a scaled probability. In particular, the probability that the movement that occurred closest to time t was preferred is 1 if STA(t) = 1 and 0.5 if STA(t) = 0. It is worth noting that the STA does not by itself provide information about the firing rate; it simply reveals the probability that the stimulus moved in the preferred direction at various times before a spike.
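The probability reading of the STA follows from the two-valued stimulus. As a short worked step, let P(t) denote the probability, given a spike at t = 0, that the movement nearest time t was in the preferred direction (the symbol P is introduced here only for illustration):

```latex
\mathrm{STA}(t) = (+1)\,P(t) + (-1)\,\bigl[1 - P(t)\bigr] = 2P(t) - 1
\quad\Longrightarrow\quad
P(t) = \frac{1 + \mathrm{STA}(t)}{2}
```

Thus STA(t) = 1 gives P = 1, STA(t) = 0 gives P = 0.5, and STA(t) = -1 gives P = 0.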
To quantify the shape of STA peaks, we computed the height and the width at half-height for each peak that met a statistical criterion. We called a peak in the STA significant if the average value across any 40 msec window within the epoch from -200 to 0 msec was at least five times the SD computed for the 20 nonoverlapping 40 msec windows in the period from -1000 to -200 msec. This criterion not only ensured that we analyzed STA peaks that were statistically significant, but also that the peaks were substantial enough to provide accurate measurements of peak height and width. We found that broad STA peaks were, in general, noisier because by necessity they either had low amplitudes or arose at lower firing rates. Therefore, we convolved such STAs with a Gaussian (SD, 4 msec) if their width at half-height, determined after smoothing, was >40 msec. This removed high-frequency noise that was not removed by the boxcar smoothing because of the sharp edges of the boxcar function.
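The peak criterion and the half-height width can be sketched as follows. This is one possible reading of the description above, not the authors' code: it assumes NumPy, an STA sampled at 1 msec from -1000 to -1 msec, that the SD is taken across the means of the 20 baseline windows, and that width is measured without interpolation. All function names are hypothetical.

```python
import numpy as np


def peak_is_significant(lags, sta, win=40, n_sd=5.0):
    """True if some 40 ms window mean in [-200, 0) ms is at least n_sd times the SD
    of the means of the non-overlapping 40 ms windows in [-1000, -200) ms."""
    base = sta[(lags >= -1000) & (lags < -200)]        # 800 samples at 1 ms
    base_means = base.reshape(-1, win).mean(axis=1)    # 20 window means
    sd = base_means.std(ddof=1)                        # ddof is an assumption
    recent = sta[(lags >= -200) & (lags < 0)]
    running = np.convolve(recent, np.ones(win) / win, mode="valid")
    return bool(running.max() >= n_sd * sd)


def height_and_width(lags, sta):
    """Peak height and width at half-height (ms), measured on the sampled curve."""
    k = int(np.argmax(sta))
    half = sta[k] / 2.0
    left = k
    while left > 0 and sta[left - 1] >= half:
        left -= 1
    right = k
    while right < len(sta) - 1 and sta[right + 1] >= half:
        right += 1
    dt = lags[1] - lags[0]
    return float(sta[k]), float(lags[right] - lags[left] + dt)
```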
We examined STAs in the frequency domain by computing the Fourier transform (FT) of the STA. We used STAs that were 512 msec long, centered at t = -80 msec, and multiplied them by a Gaussian window (mean, -80; SD, 80 msec) to suppress noise at the tails. The high-frequency cutoff was defined to be the point at which the amplitude of the FT of the STA fell to half of its maximum value. If the amplitude at low frequencies also fell to less than half of the maximum, the STA was classified as bandpass.
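The frequency-domain analysis can be sketched with a windowed FFT. This assumes NumPy and an STA sampled at 1 msec over 512 msec centered at -80 msec; the function names are hypothetical, and the cutoff is reported as the first frequency bin above the spectral peak whose amplitude falls below half of the maximum, with no interpolation between bins.

```python
import numpy as np


def sta_amplitude_spectrum(sta_512, center_ms=-80, sd_ms=80.0):
    """Amplitude spectrum of a 512 ms STA after Gaussian windowing."""
    t = np.arange(512) - 256 + center_ms               # time axis in ms
    window = np.exp(-0.5 * ((t - center_ms) / sd_ms) ** 2)
    amp = np.abs(np.fft.rfft(np.asarray(sta_512) * window))
    freqs = np.fft.rfftfreq(512, d=0.001)              # Hz, for 1 ms sampling
    return freqs, amp


def high_cutoff_and_bandpass(freqs, amp):
    """Return (high-frequency cutoff in Hz or None, bandpass flag)."""
    k = int(np.argmax(amp))
    half = amp[k] / 2.0
    below = np.flatnonzero(amp[k:] < half)
    cutoff = float(freqs[k + below[0]]) if below.size else None
    bandpass = bool(np.any(amp[:k] < half))            # low frequencies also under half max
    return cutoff, bandpass
```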
Information theoretic calculation. We used a modification of the direct information theoretic technique of Liu et al. (2001) to compute how much entropy was shared by the random motion stimulus and the response. Given a particular stimulus sequence, we used the STA to estimate the probability of a spike at a given time rather than computing that probability directly from the raw spike trains. Specifically, we estimated the mutual information between a segment of the stimulus and the predicted neuronal response in a 1 msec bin at time $t_o$ relative to the beginning of the stimulus segment. The stimulus segment was made long enough to include the region of the STA that differed substantially from zero, its length being $T = m\Delta t$, where $m$ is the number of stimulus movements (typically 16) and $\Delta t$ = 10 msec. The mutual information (Cover and Thomas, 1991) between the stimulus $X$ and the response $Y$ is as follows:

$$I(X;Y) = H(Y) - H(Y \mid X) \tag{1}$$

$$I(X;Y) = \mathcal{H}(\bar{p}) - \frac{1}{n}\sum_{i=1}^{n}\mathcal{H}(p_i) \tag{2}$$

where $\mathcal{H}$ is the binary entropy function [$\mathcal{H}(p) = -p\log_2 p - (1-p)\log_2(1-p)$], $p_i$ is the probability of a spike at $t_o$ given stimulus $s_i$ ($i = 1,\ldots,n = 2^m$), and $\bar{p}$ is the mean of $p_i$ over all $i$. The spike probability for $s_i$ is computed from the response, $r_i$, to the stimulus:

$$r_i = \sum_{t=0}^{T} s_i(t)\,\mathrm{STA}(t - t_o) \tag{3}$$

where STA(t) is the STA. The probability of a spike is determined by:

$$p_i = \alpha\,\lfloor r_i - r_{\mathrm{thresh}} \rfloor_{+} \tag{4}$$

where $\lfloor\cdot\rfloor_{+}$ is half-wave rectification, $r_{\mathrm{thresh}}$ is the rectification threshold, and $\alpha$ is set to match the mean firing rate of the response from which the STA was computed. We estimated $r_{\mathrm{thresh}}$, which is a non-decreasing function of STA amplitude, using an iterative, binary search until a value was found for which the resulting STA amplitude for the model (Eq. 4) was within 1% of that observed for the neuronal STA. We applied the method described in Equations 1-4 to compute $I(X;Y)$ for increasing $t_o$ until an asymptote was reached [asymptoting behavior is described by Liu et al. (2001), their Fig. 15]. We averaged the value of $I(X;Y)$ in a 10 msec period after the asymptote was reached (typically 160-170 msec after the start of the stimulus segment). This value, in units of bits, was converted into a rate by dividing by the 1 msec bin size, and this was then divided by the spike rate to produce a value with units of "bits/spike" (Liu et al., 2001). Care is required when interpreting this value. It does not capture any information that might be present in the temporal relationships between spikes because we have estimated only the information between the stimulus and the occurrence (or not) of a spike in a single time bin. However, at low spike rates (sometimes <1 spike/sec), there is no practical way to estimate such relationships. We will use this value only for comparing stimulus conditions to each other based on the shape of the STAs.
We computed the STA implied by Equation 4 and verified that it matched the STA from the neuronal responses. For several cells and stimulus conditions with very high firing rates, we were able to compare the results of our STA-based method to results computed directly from the spike trains using the method of Liu et al. (2001). We found these results matched very closely for short stimulus segments (six to eight letters, or 60-80 msec), which were the only ones for which the spike-train method had a negligible bias.
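The STA-based information estimate can be sketched as follows. This is a simplified illustration of the steps described above, not the authors' implementation: it assumes NumPy, takes the rectification threshold as given rather than finding it by binary search, evaluates a single response time, and represents the STA by one weight per 10 msec stimulus movement (a hypothetical `frame_weights` argument, e.g., the STA summed over each frame).

```python
import itertools

import numpy as np


def binary_entropy(p):
    """Binary entropy in bits; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))


def bits_per_spike(frame_weights, mean_rate_hz, r_thresh=0.0, bin_ms=1.0):
    """Mutual information between all 2**m stimulus sequences and a spike in one bin.

    The threshold must leave at least one stimulus with a positive rectified response.
    """
    m = len(frame_weights)
    stimuli = np.array(list(itertools.product((-1.0, 1.0), repeat=m)))  # n = 2**m rows
    r = stimuli @ np.asarray(frame_weights, dtype=float)  # linear response to each stimulus
    drive = np.maximum(r - r_thresh, 0.0)                 # half-wave rectification
    p_bar = mean_rate_hz * bin_ms / 1000.0                # mean spike probability per bin
    p = drive * (p_bar / drive.mean())                    # scale to match the mean rate
    info_bits = binary_entropy(p.mean()) - binary_entropy(p).mean()
    return float(info_bits / p_bar)                       # bits per bin / spikes per bin


# One movement decides the response: a spike identifies about 1 bit of the stimulus.
print(bits_per_spike([1.0], mean_rate_hz=10.0))
# Two consecutive preferred movements are required: about 2 bits per spike.
print(bits_per_spike([1.0, 1.0], mean_rate_hz=10.0, r_thresh=1.0))
```

The second call shows in miniature why a spike can carry more information when it requires several consecutive preferred movements.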
Simulation of a spiking motion detector
We compared the neuronal STAs with those produced by a common model for cortical direction selectivity, the ME model (Adelson and Bergen, 1985). The first stage of the model was a pair of oriented, linear space-time filters constructed as the product of Gaussians and sinusoids (Grzywacz and Yuille, 1990):

$$G(\mathbf{r},t) = \exp\left(-\frac{|\mathbf{r}|^{2}}{2\sigma_r^{2}}\right)\exp\left(-\frac{t^{2}}{2\sigma_t^{2}}\right)\exp\left[2\pi i\left(f_r\,\mathbf{n}\cdot\mathbf{r} + f_t\,t\right)\right] \tag{5}$$

where $\mathbf{r} = (x,y)$ is the spatial position vector, $t$ is time, $f_r$ is the SF, $f_t$ is the TF, $\sigma_r$ is the SD of the spatial Gaussian, $\sigma_t$ is the SD of the temporal Gaussian, and $\mathbf{n} = (\cos\theta, \sin\theta)$ is the normal vector defining the spatial orientation and direction of the sinusoid in terms of the angle $\theta$. The real and imaginary parts of Equation 5 represent the two quadrature filters of the complex DS cell model. We will use $G_{+}$ to refer to the quadrature filter pair for the preferred direction of motion and $G_{-}$ to refer to the filter pair for antipreferred motion, which is derived by replacing $\mathbf{n}$ with $-\mathbf{n}$.
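The square of the modulus of the convolution of the input image intensity, $I(\mathbf{r},t)$, with the filters yields the local motion energies in the preferred and antipreferred directions in space and time:

$$E_{+}(\mathbf{r},t) = \left|\,(G_{+} * I)(\mathbf{r},t)\,\right|^{2} \tag{6}$$

$$E_{-}(\mathbf{r},t) = \left|\,(G_{-} * I)(\mathbf{r},t)\,\right|^{2} \tag{7}$$

The responses in time for the preferred and antipreferred motion detectors located at the center of the image, $\mathbf{r} = (0,0)$, are, respectively, as follows:

$$p(t) = E_{+}(\mathbf{0},t) \tag{8}$$

$$a(t) = E_{-}(\mathbf{0},t) \tag{9}$$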
The motion model was simulated on a discrete grid (32 x 32 pixels) with a spatial resolution of 0.2°/pixel and a temporal resolution of 2 msec. The temporal dimension was matched to the duration of the stimulus being tested. The parameters of the motion filters were set to match a typical V1 complex DS neuron as follows: σ_r = 0.18°, σ_t = 15 msec, f_r = 1.25 cycles/degree, f_t = 10 Hz, θ = 0. The input image sequence, I(r,t), had luminance values ranging from 0 to 1, corresponding to the minimum and maximum luminance in our stimuli, and was modulated in space and time to mimic the visual stimuli presented to the neurons.
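The motion energy computation can be illustrated with a reduced version of the model. The sketch below is not the simulation used in the paper: it assumes NumPy, keeps only one spatial dimension (along the axis of motion), subtracts the mean luminance before filtering, and delays the temporal Gaussian by 60 msec so that the filter looks only at past frames. Function and constant names are hypothetical; the grid and filter parameters are the ones listed above.

```python
import numpy as np

DX, DT = 0.2, 0.002              # degrees per pixel, seconds per sample
SIGMA_X, SIGMA_T = 0.18, 0.015   # SDs of the spatial and temporal Gaussians
F_X, F_T = 1.25, 10.0            # cycles/degree, Hz


def gabor_pair(direction=+1):
    """Complex space-time Gabor; real and imaginary parts form the quadrature pair."""
    x = (np.arange(32) - 16) * DX
    t = np.arange(-30, 31) * DT
    xx, tt = np.meshgrid(x, t, indexing="ij")
    envelope = np.exp(-xx ** 2 / (2 * SIGMA_X ** 2) - tt ** 2 / (2 * SIGMA_T ** 2))
    carrier = np.exp(2j * np.pi * (direction * F_X * xx + F_T * tt))
    return envelope * carrier


def motion_energy(movie, direction=+1):
    """movie: array (32, n_t) with luminance in [0, 1]. Returns energy at each time step."""
    g = gabor_pair(direction)
    centred = movie - 0.5                                   # remove mean luminance (assumption)
    n_tau = g.shape[1]
    energy = []
    for k in range(n_tau - 1, movie.shape[1]):
        seg = centred[:, k - n_tau + 1:k + 1][:, ::-1]      # seg[:, j] is the frame j samples ago
        energy.append(np.abs(np.sum(g * seg)) ** 2)         # squared modulus of filter output
    return np.array(energy)


# A grating drifting in the direction matched by direction=+1.
x = (np.arange(32) - 16) * DX
t = np.arange(300) * DT
grating = 0.5 + 0.5 * np.cos(2 * np.pi * (F_X * x[:, None] - F_T * t[None, :]))
print(motion_energy(grating, +1).mean(), motion_energy(grating, -1).mean())
```

For this drifting grating, the energy from the matched filter pair should be much larger than that from the opposite pair, which is the asymmetry the opponent stage relies on.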
Outputs of the ME computation served as inputs to an IF neuronal model. The intracellular voltage, V, of the model neuron obeyed the equation:

$$C\,\frac{dV}{dt} = -g_{\mathrm{ex}}\,(V - V_{\mathrm{ex}}) - g_{\mathrm{in}}\,(V - V_{\mathrm{in}}) - g_{\mathrm{leak}}\,(V - V_{\mathrm{rest}}) \tag{10}$$

where C is the membrane capacitance, V_ex and V_in are reversal potentials for the excitatory and inhibitory conductances, g_ex and g_in, respectively, and V_rest is the reversal potential for the leak conductance, g_leak. When V reached V_thresh, a spike was discharged and V was set to V_reset. To implement a refractory period, V was held at V_reset for 1.5 msec after each spike. The excitatory and inhibitory conductances were proportional to p(t) and a(t) (Eqs. 8, 9) with added noise as follows:

$$g_{\mathrm{ex}}(t) = c_{\mathrm{ex}}\,p(t) + n_p(t) \tag{11}$$

$$g_{\mathrm{in}}(t) = c_{\mathrm{in}}\,a(t) + n_a(t) \tag{12}$$

where n_p(t) and n_a(t) were Gaussian-filtered (SD, 1 msec) Gaussian white noise (mean, 20 nS; SD, 5 nS), and c_ex = 0.35 nS and c_in = 1.0 nS [the ME outputs p(t) and a(t) are unitless]. Values of the other parameters were C = 500 pF, V_ex = 0 mV, V_in = -70 mV, g_leak = 75.0 nS, V_rest = -73.6 mV, V_thresh = -52.5 mV, and V_reset = -56.5 mV (values of Troyer et al., 1998). The voltage equation was simulated using a fifth-order Cash-Karp Runge-Kutta method with adaptive step size (Press et al., 1992).
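The spiking stage can be sketched as a conductance-driven integrate-and-fire unit. This sketch uses fixed-step forward Euler integration in place of the adaptive Cash-Karp Runge-Kutta method named above, and the function name and the 0.1 msec step are assumptions; parameter values are the ones listed in the text.

```python
import numpy as np

C = 500e-12          # membrane capacitance (F)
V_EX = 0.0           # excitatory reversal potential (V)
V_IN = -70e-3        # inhibitory reversal potential (V)
G_LEAK = 75e-9       # leak conductance (S)
V_REST = -73.6e-3    # reversal potential of the leak conductance (V)
V_THRESH = -52.5e-3  # spike threshold (V)
V_RESET = -56.5e-3   # reset potential (V)
T_REF = 1.5e-3       # refractory period (s)


def simulate_if(g_ex, g_in, dt=1e-4):
    """g_ex, g_in: conductances in siemens sampled every dt seconds.
    Returns a list of spike times in seconds (forward Euler, not Cash-Karp)."""
    v = V_REST
    refractory_until = -1.0
    spike_times = []
    for k in range(len(g_ex)):
        t = k * dt
        if t < refractory_until:
            v = V_RESET          # hold at reset during the refractory period
            continue
        dv = (-g_ex[k] * (v - V_EX) - g_in[k] * (v - V_IN) - G_LEAK * (v - V_REST)) / C
        v += dv * dt
        if v >= V_THRESH:
            spike_times.append(t)
            v = V_RESET
            refractory_until = t + T_REF
    return spike_times


n = 10000  # 1 s of simulated time at dt = 0.1 ms
print(len(simulate_if(np.full(n, 20e-9), np.zeros(n))))  # subthreshold drive
print(len(simulate_if(np.full(n, 50e-9), np.zeros(n))))  # suprathreshold drive
```

With these parameters and no inhibition, a constant excitatory conductance must exceed roughly 30 nS for the steady-state voltage to reach threshold, so the 20 nS mean noise conductance alone leaves the unit below threshold on average.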
We chose to use an opponent model in which inhibition from an antipreferred motion opposes the excitation from preferred motion because of observations that an antipreferred motion has a suppressive influence on the neuronal response. In particular, an antipreferred motion often suppresses spontaneous firing, and it delays the subsequent responses to preferred motion (Bair et al., 2002). We performed some simulations on the IF model in isolation by explicitly manipulating g_ex(t) while holding g_in(t) = 0. We generated g_ex(t) as a binary random sequence like that shown in Figure 1C for a particular mean and SD. Negative conductance values, if they occurred, were set to zero. STAs were computed from 6 min of simulated time.
Results
We examined the temporal integration of motion in 48 complex DS cells in V1 and 40 DS cells in MT. For each population, Table 1 summarizes some commonly reported response measures and RF properties derived from our standard characterization of each cell. After determining the preferences of each cell for drifting sinusoidal gratings, we assessed the temporal profile of the RF using our random motion stimulus, and we quantified how temporal integration changed with three stimulus parameters: speed, SF, and contrast. Not all parameters were varied for each cell (numbers are given below). After describing the neuronal data, we applied the same techniques to characterize a widely used model of motion detection and a simple mechanism of spike generation.
Stimulus speed and neuronal integration time
To test the dependence of integration time on speed, we presented our random motion stimulus at various step sizes, Δφ, ranging from 1/1024 to 1/4 cycle, in octave steps, while holding the SF of the grating at the optimal value for the cell. The phase shifts for this range of Δφ correspond to temporal frequencies from 0.1 to 25 Hz, which we will refer to as equivalent TFs, or ETFs. The speed of a moving grating equals TF/SF, so for a typical SF of 1 cycle/degree (Table 1), the range of speeds tested was 0.1-25°/sec. For human observers, scrutiny was required to detect motion in the slowest of the nine stimuli, whereas the fastest stimulus appeared blurry and of lower contrast because of its rapid motion.
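The mapping from step size to ETF and speed is simple arithmetic and can be tabulated directly. A minimal sketch with hypothetical function names; the 10 msec frame duration and the 1 cycle/degree example SF come from the text.

```python
FRAME_S = 0.010  # duration of one video frame (s)


def etf_hz(step_cycles):
    """Equivalent temporal frequency: phase step per frame divided by frame duration."""
    return step_cycles / FRAME_S


def speed_deg_per_s(tf_hz, sf_cyc_per_deg):
    """Speed of a moving grating equals TF/SF."""
    return tf_hz / sf_cyc_per_deg


# Nine octave steps from 1/1024 to 1/4 cycle.
steps = [2.0 ** -k for k in range(10, 1, -1)]
for s in steps:
    tf = etf_hz(s)
    print(f"step 1/{round(1 / s)} cycle: ETF {tf:.2f} Hz, "
          f"speed at 1 cycle/degree {speed_deg_per_s(tf, 1.0):.2f} deg/sec")
```

The smallest step gives an ETF just under 0.1 Hz, which the text rounds to 0.1 Hz.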
We recorded responses from 31 V1 complex DS cells and 21 MT cells for these stimuli and assessed the profile of temporal integration by calculating the STA, which is the average of all stimulus segments that preceded a spike. For an example V1 cell, the STAs for a subset of the ETFs are shown in Figure 2A. The STA for the fastest motion (ETF, 25 Hz) had the tallest and narrowest peak (thin solid line), and as the ETF was decreased, the STA height decreased. The inset in Figure 2A plots the mean firing rate for each ETF. The filled circles in the inset correspond to the six STAs shown in the main panel (arrows mark correspondence for two cases). Two important facts are immediately obvious. First, the STA peaks are not scaled versions of each other; the peaks are wider and extend farther back in time for slower motion. Second, the variation in the shape of the STA is not simply related to the change in mean firing rate. For example, the firing rates for ETFs of 1.6 and 25 Hz were the same, yet the STAs differed markedly (Fig. 2A, open and filled arrows). A set of STAs for an MT cell is shown in Figure 2B. Progressing from the fastest to the slowest stimulus, the firing rate decreased steadily (Fig. 2B, inset), but the STA width increased only after the ETF dropped below 1 Hz (e.g., STA at open arrow). The STA peak heights for this cell varied little with stimulus speed when compared with the example in Figure 2A. Thus, the STA peak for slow motion was both wide and tall (Fig. 2B, thickest solid line). Such a peak indicates that the discharge of a spike typically required the occurrence of several consecutive preferred movements. These two examples represent a range of behavior that was observed in both V1 and MT, and they do not represent systematic differences between these two areas.

Figure 2. STAs as a function of stimulus ETF. A, STAs for six ETFs are plotted for an example V1 complex DS cell. In the inset, the mean firing rate is plotted for all nine ETF values tested, and the dashed line shows the spontaneous rate. In the main panel, STAs are shown only for the solid points in the inset. The arrows mark corresponding points and STA curves. The line style legend (square box) shows the progression from low to high ETF. B, STAs formatted as in A are shown for an example MT cell. Here and in A, STAs are wider for slower motion (i.e., lower ETF). C, The average width at half-height of the STA peak is plotted against ETF for 31 V1 cells and 21 MT cells. The error bars show SEM. D, The average height of the STA peak is plotted against ETF. E, The average of the product, height x width, is plotted against ETF. F, The asymptotic value of the mutual information between the stimulus and the response (calculated from the STA) in a 1 msec bin was divided by the bin width, and this measure was then divided by spike rate (see Materials and Methods). G, The mean firing rate in excess of the spontaneous rate is plotted against ETF.
The patterns observed in the examples suggest that the temporal integration of motion reflected by the width of the STAs was not constant but varied with stimulus speed. Specifically, for rapidly moving stimuli, the spike of a DS cell signaled the occurrence of preferred motion in a relatively recent and brief time window. For slow motion, however, a spike from the same neuron signaled preferred motion over a longer time window that extended farther into the past. In other words, the broadening of the STA peak resulted from a substantial leftward shift of the left side of the peak, whereas the right side shifted much less (as seen in Fig. 2B) and often appeared to simply lean more to the left (Fig. 2A). This behavior occurred in all of the cells that we studied. A less consistent feature of STAs was a negative lobe to the left of the positive peak (Fig. 2A, thinnest line). This dip indicates that motion in the antipreferred direction facilitated the response to subsequent preferred motion. The dip is associated with transient responses to preferred motion and can arise from various mechanisms, including spike rate adaptation and synaptic depression, both of which can make responses more transient. In an extreme case in which a cell fires only transiently at the onset of preferred motion, the dip and the positive peak can have equal area (data not shown). However, the dips were almost always small compared with the positive peak, so we will focus on the latter.
To quantify the most prominent changes in the STA peaks across our database, we computed the height and the width (at half-height) for each STA peak that was >5 SDs above the noise (see Materials and Methods). Figure 2C shows that the mean peak width changed as a function of ETF for V1 and MT, increasing from 20 msec for fast motion to 50-70 msec for slow motion. The data for V1 and MT followed a common trend for ETFs of 1 Hz and above, but for slower stimuli, the STAs for MT were broader than those for V1. The difference between V1 and MT at slow speeds may be greater than it appears, because more V1 than MT cells failed to meet our STA peak criterion at low ETF. For example, at the second lowest ETF, 20 of 21 STAs met the criterion for MT, whereas only 16 of 31 did for V1. At the lowest ETF, this dropped to 13 of 21 for MT and 12 of 31 for V1. The average height of the STA peaks is plotted in Figure 2D. Peaks were tallest, on average, at an ETF of 12.5 Hz and decreased by approximately half at the lowest speeds for both V1 and MT. To verify that the curves in Figure 2, C and D, were not affected by having fewer cells at the lowest ETFs, we recomputed the curves using data from only those cells that had significant STA peaks at all ETFs. These curves were very similar to those shown.
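The two shape measures used here, peak height and width at half-height, can be illustrated with a minimal Python sketch. The function name `sta_peak_height_and_width`, the use of linear interpolation at the half-height crossings, and the convention that time is given in milliseconds relative to the spike are assumptions made for illustration; the procedure actually used, including the 5 SD significance criterion, is described in Materials and Methods and is not reproduced here.

```python
import numpy as np

def sta_peak_height_and_width(t_ms, sta):
    """Return (height, width_ms) of the positive STA peak.

    height is the maximum STA value; width is the full width at
    half-height, with the two crossings located by linear interpolation.
    """
    t_ms = np.asarray(t_ms, dtype=float)
    sta = np.asarray(sta, dtype=float)
    i_peak = int(np.argmax(sta))
    height = sta[i_peak]
    if height <= 0:
        raise ValueError("STA has no positive peak")
    half = height / 2.0

    # Walk left from the peak to the first sample below half-height.
    i = i_peak
    while i > 0 and sta[i] >= half:
        i -= 1
    if sta[i] < half:
        t_left = t_ms[i] + (half - sta[i]) * (t_ms[i + 1] - t_ms[i]) / (sta[i + 1] - sta[i])
    else:
        t_left = t_ms[0]  # peak runs into the start of the record

    # Walk right from the peak to the first sample below half-height.
    j = i_peak
    while j < len(sta) - 1 and sta[j] >= half:
        j += 1
    if sta[j] < half:
        t_right = t_ms[j - 1] + (sta[j - 1] - half) * (t_ms[j] - t_ms[j - 1]) / (sta[j - 1] - sta[j])
    else:
        t_right = t_ms[-1]  # peak runs into the end of the record

    return height, t_right - t_left

# Synthetic example (not recorded data): a Gaussian-shaped peak
# centered 40 msec before the spike with a 10 msec SD.
t = np.arange(-200.0, 1.0)
example_sta = 0.5 * np.exp(-0.5 * ((t + 40.0) / 10.0) ** 2)
height, width = sta_peak_height_and_width(t, example_sta)
```

For a Gaussian-shaped peak the width at half-height is about 2.35 times the SD, which gives a quick sanity check on any implementation of this measure.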
Taller STA peaks indicate that, given the occurrence of a spike, the direction of motion is known with greater probability, and wider peaks indicate that the direction can be estimated over a longer epoch. Therefore, to get an approximate estimate of how informative a single spike was about the recent stimulus motion, we multiplied the peak width and height to approximate the area of the STA peak. Figure 2E shows that the area was largest at rather low ETFs, from 0.5 to 1 Hz. At the lowest ETFs, the area decreased sharply in V1 (Fig. 2E, thick line) because the STA peaks collapsed without increasing in width. In MT, however, the average peak area (for the 62% of cells that still met the 5 SD criterion) remained near its maximum value at the lowest ETF. These results imply that under conditions of very slow change in the visual image the occurrence of a single spike can convey a substantial amount of information. To make a more rigorous estimate of the information conveyed by a single spike, we used the entire STA (from t = -180 to -20 msec), not just the positive peak, as a basis for computing the mutual information between a spike and the pattern of movements in the stimulus (see Materials and Methods). These mutual information values were strongly correlated to the STA area values (r = 0.85 for V1; r = 0.62 for MT; p < 0.0001 for both). The mean mutual information, expressed in bits per spike, is plotted in Figure 2F as a function of ETF. Because the information metric favors height over width in the STA peak, the maximum values on these curves lie at higher ETFs (for which STA peaks are taller) compared with the maximum values on the area curves. Nevertheless, the trend in both sets of curves differed strikingly from that for the mean firing rate (Fig. 2G), which peaked near the highest ETFs and dropped off sharply at medium to low ETFs. 
Thus, a single spike often conveyed as much or more information about the motion of the stimulus under conditions in which the stimulus was substantially suboptimal for the cell in terms of firing rate.
In summary, for all DS cells that we studied, the STA peak grew wider by spreading back in time for slower moving random stimuli. This indicates that the effective integration time of the computations underlying DS responses in cortex changes with stimulus conditions. The integration time for the slowest motion that we tested was, on average, approximately two to four times longer than that for the fastest motion.
Spatial frequency and integration time
We tested whether the temporal integration of motion also depended on SF by presenting the random motion stimulus at a variety of SFs, including values well below and well above the optimal SF of each cell (determined for smoothly drifting gratings). We held the step size constant at 1/8 of the spatial period. Thus, the ETF remained constant at 12.5 Hz, which was optimal or near optimal in terms of the height of the STA peak (Fig. 2D) and evoked firing rate (Fig. 2G) for most cells.
Figure 3A shows how the STA shape changed with changes in SF for an example V1 complex DS cell that had an optimal SF of 3 cycles/degree. For clarity, STAs are plotted for only six of the nine SFs tested (Fig. 3A, line style legend below panel shows progression from low to high SF). Progressing from the lowest to the optimal SF, the STA peak remained narrow and the peak height increased somewhat (Fig. 3A, progression from thin solid line to the line of shortest dashes). For SFs above the optimal, the STA peak grew substantially wider (thick dashed and thick solid lines). An example MT cell shows similar trends (Fig. 3B), except that the STA peak for very low SF (thin solid line) was broader than the peaks for near-optimal SFs (dashed lines). The STA peak for the highest SF (thick solid line) was again substantially broader and shifted to the left compared with STAs for near-optimal SFs. In fact, for all but one cell (an MT cell), the STA peak was wider at the highest SF compared with the optimal SF.

Figure 3. STAs as a function of grating SF. The format is the same as in Figure 2, except that SF is changing. A, STAs and firing rate for an example V1 cell. The icons in the spike rate inset show the visual stimuli drawn to scale for the lowest and highest (left and right, respectively) SFs that had STA peaks above the noise. High SF consistently yielded broad STAs (thickest line). The legend box between A and B shows the sequence of line styles from low to high SF. The optimal SF for drifting gratings (3 cycles/degree) corresponds to the short-dashed line. B, STAs and firing rate for an example MT cell. The optimal SF for drifting gratings corresponds to the thin long-dashed line. C-G, Same format as Figure 2, except the horizontal axis is SF. The actual SF values tested for each cell were typically not exactly those indicated by the points plotted here. The x coordinates used here are the average SF values for all points that fell in a cluster. The error bars show SEM.
The average width at half-height of the STA peak is plotted against SF in Figure 3C for 23 V1 and 18 MT cells that we tested with various SFs (see legend for details of averaging). The width increased from 25 msec at 1 cycle/degree to 50 msec at the highest SFs that yielded significant peaks. The rightward shift of the average V1 curve relative to the MT curve was consistent with a mismatch in the RF eccentricities for the two populations. In particular, the average eccentricity for our V1 population (7°; SD, 6) was approximately half of that for the MT population (15°; SD, 12). The curves for individual cells (data not shown) typically were either U-shaped (with the minimum near the optimal SF) or were flat below the optimal frequency and increasing above the optimal. This accounts for the somewhat U-shaped average curve for MT (Fig. 3C, thin line). We also aligned the curve for each cell to its optimal SF before averaging across cells, but this yielded curves (data not shown) very similar to those shown. Overall, the width of the STA peak was determined partially by the absolute SF (high SFs yielded wider peaks) and partially by the relative SF (SFs near optimal or within a factor of 2 lower than optimal had narrower peaks). The STA height (Fig. 3D) was largest in the middle of the range of SFs (near the optimal SFs) tested in V1 and MT and dropped off somewhat at high and low SFs.
The estimated peak area (width x height) (Fig. 3E) was greatest, on average, for higher than optimal SFs, particularly for V1 (thick line), where the mean STA area was greatest at the highest SFs tested. The mutual information curves in Figure 3F showed similar trends: the mean information was higher at higher SFs, except for a drop at the highest SFs tested in MT. For V1, there was a striking divergence at high SF between the mean spike rate (Fig. 3G, thick line), which dropped rapidly toward zero, and the area and information curves, which continued to increase. The association of high STA area with low firing rate was noted above for variations in ETF (Fig. 2E,G), and it typically resulted from a substantial increase in the width of the STA peak and a modest decrease in height. Sometimes, however, increased area was caused mainly by an increase in height. Striking examples of this are shown in Figure 4, A and B, for a V1 and MT cell, respectively. In Figure 4A, the STA grew monotonically to its upper limit of 1 (-50 to -40 msec) and toward its lower limit of -1 (-75 to -60 msec) as SF increased. The upper limit was achieved at a very low firing rate (0.1 spikes/sec) and for an SF (1.4 cycle/degree) that was at the upper end of the spike rate tuning curve for the cell. The saturated STA is the average of binary stimulus segments (Fig. 1C) that preceded 19 spikes. Similar behavior is shown for an MT cell in Figure 4B, in which the STA between -50 and -60 msec progressed monotonically to its upper limit at the high end of the SF tuning curve. When the STA asymptotes, a spike signifies with certainty the direction of stimulus motion. Equivalently, there are no false alarms (i.e., no spike occurred unless the movement was in the direction of the asymptote).
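The link between a saturated STA and certainty about motion direction can be made concrete with a short Python sketch. It assumes that the stimulus is coded as +1 (preferred) or -1 (antipreferred) at each time step, so that an STA value s corresponds to a probability (1 + s)/2 that motion was in the preferred direction at that lag. The function names, the discrete time base, and the lag convention are illustrative assumptions, not the analysis code used for the data.

```python
import numpy as np

def spike_triggered_average(stimulus, spike_idx, max_lag):
    """Average the stimulus over the max_lag steps preceding each spike.

    stimulus:  1-D sequence of +1 (preferred) or -1 (antipreferred) steps.
    spike_idx: indices of the time steps that contain a spike.
    Returns an array of length max_lag; element k is the mean stimulus
    value (k + 1) steps before a spike.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    usable = [i for i in spike_idx if i >= max_lag]
    if not usable:
        raise ValueError("no spike has a full stimulus history")
    # Reverse each segment so that index 0 is the step just before the spike.
    segments = np.array([stimulus[i - max_lag:i][::-1] for i in usable])
    return segments.mean(axis=0)

def prob_preferred(sta):
    """Probability that motion was in the preferred direction, given a spike."""
    return (1.0 + np.asarray(sta, dtype=float)) / 2.0

# Synthetic example (not recorded data): every spike is preceded, three
# steps earlier, by a step in the preferred direction.
rng = np.random.default_rng(0)
stim = rng.choice([-1.0, 1.0], size=200)
spikes = np.arange(10, 200, 10)
stim[spikes - 3] = 1.0
sta = spike_triggered_average(stim, spikes, max_lag=8)
# sta[2] equals 1, so prob_preferred(sta)[2] equals 1: no false alarms at that lag.
```

Under the same mapping, an STA value of -0.86 corresponds to a 93% probability of antipreferred motion, which is the kind of figure quoted for the negative lobe in the legend of Figure 4.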

Figure 4. Single spikes can encode motion of high SF targets with certainty. A, For a V1 complex DS cell, seven STAs are shown for values of SF ranging from low (thin solid lines) to medium (dashed lines) to high (thick solid lines). From low to high SF, the STAs form a progression from a smoothed triangular shape (open arrow) to a flat-topped form (filled arrow) that indicates 100% certainty (1 on the vertical axis) that motion was in the preferred direction 35-55 msec preceding a spike. The negative lobe of the STA also extends downward, signaling an antipreferred motion with 86-93% probability from 61 to 73 msec before a spike. Thus, for the high SF stimulus, a very particular and precisely timed pattern of motion (almost always a motion reversal) was required to make this cell fire. B, Similar behavior is shown for STAs for an example MT cell. The STA for the highest SF tested (filled arrow) reaches its asymptote of 1 from 50 to 62 msec before a spike.
Interestingly, both examples in Figure 4 had a zero spontaneous firing rate, as did the cell in Figure 2B, which also had tall STA peaks. We therefore tested for a correlation between spontaneous firing rate and STA peak height. The height of the STA at the optimal SF for each cell did not correlate with spontaneous rate. However, the height of the STA at the highest SF was negatively correlated with spontaneous rate in both V1 and MT (V1: r = -0.59, p = 0.003; MT: r = -0.66, p = 0.005). For cells with tall STA peaks (>0.6) at high SF, the average spontaneous rate was 0.7 spikes/sec (SD, 1.1) compared with 8.5 spikes/sec (SD, 6.3) for cells with shorter peaks (<0.6). This difference was highly significant (t test; p < 0.0001). A tall STA peak and a low spontaneous firing rate both suggest that a cell is operating in a high-threshold regime in which only the strongest barrages of stimulus-driven excitatory input elicit spikes. Below, we examine the regime that produces this behavior in an IF mechanism.
We have shown that the shape of the STA varied in two sets of experiments: one in which ETF varied while SF was fixed (Fig. 5A, vertical band of points) and one in which SF varied while ETF was fixed (Fig. 5A, horizontal gray band of points). Therefore, the shape of the STA cannot be a function of ETF alone or of SF alone. However, velocity, which equals TF/SF, changed in both experiments (Fig. 5A, dashed diagonals show iso-velocity contours). To test whether velocity alone accounted for the changes in the STA, we compared STAs for stimuli having the same velocity but different SFs and ETFs (Fig. 5A, open squares connected by gray line segment). For an example V1 cell, Figure 5B shows two STAs for stimuli moving at 12.5°/sec. The STA for the higher SF and ETF (thick line, thick square) was wider than that at the optimal SF and the proportionally lower ETF (thin line, thin square; see legend for stimulus parameters), although the velocity was the same. The same trend is shown for an MT cell in Figure 5C. We made this comparison for all cells tested under these two equal-velocity conditions. The STAs were wider in the high SF (and high ETF) condition for almost all cells (Fig. 5D). The average difference was 12 msec (SD, 14; n = 36; p < 0.001), and there was no significant difference between V1 and MT. This test was made along lines of relatively low velocity in spatiotemporal frequency space (Fig. 5A, gray line segment connecting squares). We performed a similar test along lines of higher velocity by comparing STAs at the highest ETF and the optimal SF to those at a lower ETF and lower SF (Fig. 5A, gray line segment connecting triangles). In this case, the STAs tended to be wider at the lower ETF and SF combination (Fig. 5E), on average, by 6 msec (SD, 9; n = 35; p < 0.01). 
These results demonstrate that velocity, like ETF and SF, cannot alone account for the changes in the STA, and they verify that SF plays an important and independent role in shaping the window of temporal integration. In particular, along the iso-velocity contour containing the highest SFs tested, the trend to have a longer integration time at high SF dominated the trend to have shorter integration time at higher ETF.
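The velocity matching behind these comparisons follows directly from velocity = TF/SF. The short Python sketch below checks it for the two stimulus pairs listed in the legend of Figure 5 (B and C); the function name and the dictionary layout are illustrative choices, and the frequency values are the rounded ones quoted in the legend.

```python
def velocity_deg_per_sec(tf_hz, sf_cyc_per_deg):
    """Grating velocity (deg/sec) from temporal frequency (Hz) and SF (cycles/deg)."""
    return tf_hz / sf_cyc_per_deg

# (TF in Hz, SF in cycles/degree) pairs quoted in the legend of Figure 5.
velocity_matched_pairs = {
    "V1 example (Fig. 5B)": [(12.5, 1.0), (3.1, 0.25)],
    "MT example (Fig. 5C)": [(12.5, 4.60), (3.1, 1.15)],
}

for label, stimuli in velocity_matched_pairs.items():
    speeds = [velocity_deg_per_sec(tf, sf) for tf, sf in stimuli]
    print(label, [round(v, 2) for v in speeds])
```

Within each pair both frequencies differ by a factor of four while the velocity stays the same to within the rounding of the quoted TF, so any difference in STA width between the two members of a pair cannot be attributed to velocity.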

Figure 5. The temporal integration profile is not determined solely by stimulus velocity. A, The typical locations of stimuli in SF-TF space are plotted for our experiments in which ETF varied (white vertical band) and in which SF varied (gray horizontal band). Points are shown only for the highest five ETFs and only for SFs at full octave intervals. Diagonals are iso-velocity contours. The thick and thin open squares connected by the long gray line segment indicate, respectively, the stimulus at the highest SF and a velocity-matched stimulus at a lower SF and lower ETF. The shorter gray line segment connecting the thick and thin triangles shows a pair of velocity-matched stimuli at a faster speed. B, For a V1 cell, the STA is plotted for SF 1 cycle/degree and ETF 12.5 Hz (thick line) and SF 0.25 cycle/degree and TF 3.1 Hz (thin line). Thus, the velocity was 12.5°/sec in both cases. The thick and thin squares indicate the positions of the stimuli in the parameter space of A. C, For an MT cell, the STA is plotted for SF 4.60 cycles/degree and TF 12.5 Hz (thick line) and for SF 1.15 cycles/degree and TF 3.1 Hz (thin line). The velocity was 2.7°/sec in both cases. D, The width at half-height of the STA at the highest SF that produced a significant peak (thick square in A) is plotted against the STA width at the optimal SF tested at the same velocity (thin square in A). Nearly all points for V1 (filled circles) and MT (open circles) fell above the line. Two points for MT, one below the line and one above the line, fell outside the range shown here: (91, 113) and (115, 73). E, The STA width at the highest ETF (25 Hz) and the optimal SF (thick triangle in A) is plotted against the width at a lower ETF (typically 12.5 Hz) at the same velocity (thin triangle in A).
In summary, motion signals generated from patterns of high SF are integrated within a time window that is, on average, 20-30 msec wider than that for patterns of lower SF. This further supports the idea that individual DS neurons do not have fixed profiles of temporal integration.
Contrast and temporal integration
The luminance contrast of visual stimuli is known to affect temporal response properties in the visual system. For example, contrast influences the relative sensitivity to low and high TFs in retinal ganglion cells (Shapley and Victor, 1981), and it affects the phase of cortical simple-cell responses to drifting gratings (Dean and Tolhurst, 1986; Carandini and Heeger, 1994; Albrecht, 1995; Carandini et al., 1997). In addition, increasing contrast increases apparent velocity (Thompson, 1982), shortens reaction times to moving gratings (Burr and Corsale, 2001), and decreases psychophysically inferred integration times (Müller and Greenlee, 1998). We therefore examined the influence of contrast on motion integration in DS neurons by varying the contrast of the sinusoidal grating in our random motion stimulus while holding the SF at the optimal value for the cell and the ETF at 12.5 Hz.
For an example V1 cell, STAs are plotted in Figure 6A for four contrasts. The amplitude of the STA peak dropped rapidly with decreasing contrast, whereas the peak width changed very little. This behavior differed from that observed for the same cell when the speed (ETF) was reduced (Fig. 2A). Reducing the speed caused a marked widening of the STA, unlike reducing contrast, although both manipulations caused the firing rate to drop to near the spontaneous level. In a second example (Fig. 6B, MT cell), the amplitude also dropped with contrast, but the STA peak became substantially wider. Although the former example is from V1 and the latter from MT, the trends occurred equally in both areas. In 10% of cells in both V1 and MT, the STA peak first grew taller (as well as wider) as contrast dropped from 100% to 25-50%. Additional reductions in contrast caused a rapid decline in peak height.

Figure 6. STAs as a function of grating contrast. The format is the same as in Figure 2, except that grating contrast is changing. A, STAs and firing rate for an example V1 cell. Only four of the six contrasts tested yielded STAs with significant peaks. As contrast was reduced from 100% to 12.5% (thinnest to thickest lines), the STA peak decreased in height and became slightly broader. B, STAs and firing rates for an example MT cell. As contrast was reduced, there was a substantial broadening of the STA peak in addition to a decrease in amplitude. C-G, Same format as Figure 2, except horizontal axis is contrast.
On average, lower contrast was associated with wider (Fig. 6C) and shorter (Fig. 6D) STA peaks (26 V1 and 29 MT cells). This indicates that low-contrast stimuli, like slow moving or high SF stimuli, elicited responses in V1 and MT that reflected longer temporal integration. At low contrast, however, the increase in STA width was relatively modest and the decrease in height was relatively steep compared with the changes at low ETF or high SF (Figs. 2 and 3, respectively). This difference was also reflected in the plots of STA area, which had lower maximum values for contrast (Fig. 6E) than for ETF or SF (Figs. 2E, 3E). The mutual information per spike (Fig. 6F) decreased more rapidly with contrast than did area, because the information measure favors peak height over peak width. The distinctions between these trends and those for ETF and SF are made in a direct and quantitative manner below.
Comparing STA shapes across stimulus dimensions
To compare the changes in STA shape across ETF, SF, and contrast, we replotted the data from Figures 2, 3, and 6 in a parametric plot of STA height against width (Fig. 7A). The black lines represent the ETF data for V1 (thick line) and MT (thin line). The end points of these lines in the top left region of the plot, which corresponds to tall and narrow STAs, are the points for the highest ETF (fastest motion). As ETF decreased, the width of the STA increased and the peak height decreased; therefore, the data points trace a line from the top left to the bottom right of the plot. Sliding to the bottom right, each successive point corresponds to a one-octave drop in ETF. The end points in the bottom right region of the plot (low ETF) correspond to short, wide STAs. Next, the gray lines show the progression of STA shape from high to low contrast (top and bottom end points are 100 and 3.125% contrast, respectively). The trend with contrast was different from that for ETF: the STA height dropped rapidly as contrast decreased but the width increased less than it did for low ETF.

Figure 7. The effects on the STA shape of varying motion speed (ETF), grating SF, and grating contrast are compared. A, Data from Figures 2, 3, and 6 (C, D) are replotted in a parametric space of STA height versus width. Thick lines show data for V1 and thin lines for MT. Black lines show ETF data. Gray lines show contrast data. Red lines show SF data. Data are marked by points on lines. Low ETF marks the ends of the black lines that correspond to the slowest motion. Also labeled are the low contrast ends of the gray lines and the high SF and low SF ends of the red lines. The region marked optimal corresponds to the location of STAs for fast, high-contrast, and optimal SF stimuli. B, Database averages of the STAs for all 31 V1 complex DS cells for the two lowest ETFs (thick line) and the highest ETF (thin line). This demonstrates how the temporal profile of integration of the V1 population changes with stimulus speed. C, Averages like those in B are shown for all 21 MT cells. These averaged STAs show the dramatic change in temporal integration with motion speed for the population. They also reveal a weak negative lobe (from 200 to 400 msec before a spike) that was not easily observed in individual STAs.
Finally, in red are the data for SF, and both the low and high SF end points are labeled (Fig. 7A). As expected, the middle of the red curves, which correspond to optimal SFs, lies close to the second points on the black curves (for ETF, 12.5 Hz) and near the top points on the gray curves (100% contrast) because they all correspond to approximately the same stimulus parameters. The curves do not line up exactly (the cell populations are somewhat different, and there is noise in the parameter estimates), but they cluster closely in the region labeled "optimal." Interestingly, the high-SF legs of the red curves followed approximately the course of the black curves toward low ETF, whereas the low-SF legs tended to follow the course of the gray curves toward low contrast. The latter trend was particularly evident for the MT data (thin red line).
It is important to consider that firing rate is lowered by four manipulations here: lowering the SF, raising the SF, lowering the contrast, and lowering the ETF (in fact, raising the ETF above 12.5 Hz also lowers the firing rate). However, these manipulations do not cause the STA to follow a single trajectory in the parameter space in Figure 7A. This strongly argues against the hypothesis that the signal strength at the soma of the recorded neuron controls temporal integration. If it did, then variations in contrast should be able to achieve STAs with shapes similar to those for variation in SF or ETF, but this was not the case. For example, the dashed circles in Figure 7A mark regions on the SF and ETF curves where the evoked firing rate is equal (6 spikes/sec above spontaneous) but STA width is different. The discrepancy between the firing rate and STA shape can also be seen for individual cells (e.g., within Fig. 2A, and between Figs. 2A and 6A, which show data from the same cell).
Across the parameter variations that we tested, the widest STAs occurred for low ETF stimuli, as indicated by the right end points of the black lines in Figure 7A. To show explicitly what the window of temporal integration looks like for DS cells operating at the slow end of their range, we averaged together the STAs for all V1 complex DS cells for the slowest two ETFs tested (0.1 and 0.2 Hz; no statistical criterion was applied to the STA peaks). The result is plotted in Figure 7B (thick line). This population STA reveals a window of positive temporal integration that extends over 120 msec, from 30 to 150 msec before the occurrence of a spike. The same type of population average for an ETF of 25 Hz (Fig. 7B, thin line) reveals a temporal profile spanning 45 msec, from 25 to 70 msec before the spike. Analogous population STAs are plotted for MT in Figure 7C. For slow motion, MT cells use information within a similar 120 msec window, but the population average also shows a clear depression that extends over at least 220 msec, from 180 to 400 msec before the spike. This suggests that slow antipreferred motion has a facilitatory effect on spiking that lasts for hundreds of milliseconds. Population averaging exposed the temporal extent of integration by removing noise in STAs for single cells, which by themselves rarely showed significant modulation this far back in time. The MT population STA for fast motion has a 30 msec window of temporal integration from 25 to 55 msec before the spike (Fig. 7C, thin line). It also has a negative lobe. Some caution is required when comparing these population averages to individual STAs because narrow STAs with large negative lobes can cancel positive portions of wider STAs. Nevertheless, the population averages for slow motion provide a less noisy look at integration at longer time scales.
The change in the STA over time
The different STA shapes observed across stimulus conditions show that the temporal integration of motion in V1 and MT cells is not fixed. However, the data presented reflect steady-state measurements (i.e., averages over the full duration of stimuli that lasted for 20-40 sec). We wondered whether changes in the STAs might reflect a slow adaptive process. To test this, we examined the width and height of the STAs computed from just the first 4 sec of the stimulus (the early period) and from a late period, 16-20 sec after stimulus onset. Windows shorter than 4 sec were impractical because they often yielded STAs that did not meet our criterion for a significant peak.
Figure 8 shows STAs in the early and late periods for slow and fast motion (ETF: 0.4 Hz at left, 25 Hz at right) for a V1 complex DS cell. The STA for the early period (Fig. 8A,C, thin solid lines) was somewhat narrower than the STA computed over the entire stimulus (dashed lines), whereas the STA for the late period (thick lines) was slightly wider. In Figure 8, we plotted the late STA width against the early STA width for individual cells for the slowest motion that yielded significant STA peaks (B) and for the fastest motion (D). The clusters of points for both V1 and MT were centered above the diagonal line, indicating that STAs in the late period were generally wider than those in the early period. The mean difference in width for slow motion was 9 msec for V1 (SD, 10; n = 29; p = 0.001; paired t test) and 7 msec for MT (SD, 12; n = 21; p = 0.08). For fast motion, the difference was 3 msec for V1 (SD, 5; n = 29; p = 0.02) and 4 msec for MT (SD, 7; n = 19; p = 0.14).

Figure 8. Changes in the width of the STA peak over time during the stimulus. A, For an example V1 complex DS cell, the STA for ETF 0.4 Hz is plotted for the first 4 sec of the stimulus (thin solid line), for the 4 sec period beginning 16 sec after stimulus onset (thick line), and for the entire stimulus (dashed line). B, For each cell, the width at half-height of the STA peak for the late epoch is plotted against that for the early epoch for the lowest ETF that yielded a significant STA peak. Points for both V1 (filled circles) and MT (open circles) fell mainly above the diagonal line. C, D, The same format as A and B, except the comparison is made for fast motion (ETF 25 Hz).
We compared early and late STA widths for two other conditions that were associated with wider peaks: high SF and low contrast. For high SF, the width in the late period was, on average, larger by 5 msec in V1 (SD, 5 msec; n = 23; p = 0.001) and by 1 msec in MT (SD, 5 msec; n = 16; p = 0.53). For low contrast, the average width increased by 1 msec in V1 (SD, 9; n = 26; p = 0.81) and 3 msec in MT (SD, 10; n = 29; p = 0.19). We also tested the STA peak heights in all four conditions (fast, slow, high SF, and low contrast), but there were no significant changes, on average, between the early and late epochs. Evidence that our measurements were not dominated by noise is provided by the observation that in all conditions the early values were strongly correlated to the late values (0.75 < r < 0.96; p < 0.001 in all four cases for both width and height).
Although the mean increase in peak width in the late period was modest and, in many cases, not statistically significant, we tested whether the size of the increase over time was related to the size of the increase with stimulus parameters. For each cell, we paired the change in width across time with the change between the suboptimal (low ETF, high SF, or low contrast) and optimal conditions (the latter change being computed for full trials). In no case was there a significant correlation between these paired values, suggesting that the changes in width over time were not related to the changes across stimulus parameters. Finally, we found no significant correlation between the change in firing rate and the change in STA width during the trial.
In summary, integration time increased slightly, on average, during the trial for both wide and narrow STAs. We found no evidence to link the changes over time to the changes observed with stimulus parameters on a cell-by-cell basis. These results are consistent with the idea that changes in STA shape observed across stimulus parameters occur predominantly on a time scale shorter than several seconds. However, better estimates of the rate of change of temporal integration may require a deterministic adapt-and-test paradigm that uses a test stimulus that is brief compared with the duration of the random input required for estimating the STA.
Frequency domain analysis of the STA
To facilitate the comparison of our results to the extensive frequency domain analysis of cortical RFs, we computed the FT of the STAs (see Materials and Methods) and plotted Fourier amplitude as a function of frequency. Here, we focus on the high-frequency cutoff and the distinction between low-pass and bandpass behavior, whereas in the time domain, we focused solely on the positive STA peak. We refer to an STA as bandpass if the amplitude of its FT fell below half of its maximum value at low frequencies. We expect narrower STAs to have higher cutoff frequencies and STAs with large dips (negative lobes to the left of the positive peak) to be bandpass.
Figure 9A shows the amplitude spectra for three of the STAs for the example cell from Figure 2A. For fast motion (ETF, 25 Hz) (Fig. 9A, thin line with dots), the STA spectrum is predominantly low-pass but drops somewhat at low frequency. For the slowest motion (ETF, 0.2 Hz) (Fig. 9A, thickest line), the spectrum is low-pass and the cutoff frequency (open circle) is substantially lower. Across the population at ETF 25 Hz, only 5 of 31 V1 cells (16%) and 6 of 21 MT cells (29%) were bandpass. All but one of these cells became low-pass at low ETFs. The average peak frequencies of the bandpass STAs were 11 and 13 Hz (SD, 1 Hz) for V1 and MT, respectively. For individual cells, the high-frequency cutoff for slow motion is plotted against that for fast motion in Figure 9B. The average cutoff frequency dropped from 19 to 6 Hz for V1 cells and from 21 to 7 Hz for MT cells as the stimulus ETF changed from its highest to its lowest value. This behavior in the frequency domain corroborates the large and systematic changes observed in the STAs in the time domain. A shift to lower frequencies in the temporal spectrum of the response may seem like a natural consequence of lowering the stimulus ETF; however, such a shift is not predicted by a standard model for motion detection (see below).
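The frequency-domain measures used here (the high-frequency cutoff at half of the maximum amplitude and the low-pass versus bandpass distinction) can be sketched in Python with NumPy's real FFT. The function names, the 1 msec sampling, the linear interpolation of the cutoff, and the inclusion of the zero-frequency term in the bandpass test are assumptions made for illustration; the exact procedure is given in Materials and Methods.

```python
import numpy as np

def amplitude_spectrum(sta, dt_ms):
    """Return (freqs_hz, amplitude) of the Fourier transform of an STA."""
    sta = np.asarray(sta, dtype=float)
    amp = np.abs(np.fft.rfft(sta))
    freqs = np.fft.rfftfreq(len(sta), d=dt_ms / 1000.0)
    return freqs, amp

def cutoff_and_bandpass(freqs, amp):
    """Return (cutoff_hz, is_bandpass) for an amplitude spectrum.

    cutoff_hz:   frequency above the spectral peak at which the amplitude
                 first drops to half of its maximum (None if it never does).
    is_bandpass: True if the amplitude also falls below half of its maximum
                 at frequencies below the spectral peak.
    """
    i_max = int(np.argmax(amp))
    half = amp[i_max] / 2.0
    is_bandpass = bool(np.any(amp[:i_max] < half))
    below = np.nonzero(amp[i_max:] < half)[0]
    if below.size == 0:
        return None, is_bandpass
    j = i_max + int(below[0])
    # Linear interpolation between the samples that bracket the crossing.
    frac = (amp[j - 1] - half) / (amp[j - 1] - amp[j])
    cutoff = freqs[j - 1] + frac * (freqs[j] - freqs[j - 1])
    return float(cutoff), is_bandpass

# Synthetic examples (not recorded data), sampled at 1 msec.
t = np.arange(-500.0, 500.0)
narrow_peak = np.exp(-0.5 * ((t + 40.0) / 10.0) ** 2)
# Adding an equal-area negative lobe earlier in time removes the DC component.
peak_with_dip = narrow_peak - np.exp(-0.5 * ((t + 80.0) / 10.0) ** 2)

cutoff_lp, bandpass_lp = cutoff_and_bandpass(*amplitude_spectrum(narrow_peak, 1.0))
cutoff_bp, bandpass_bp = cutoff_and_bandpass(*amplitude_spectrum(peak_with_dip, 1.0))
```

In this sketch a purely positive, Gaussian-shaped peak yields a low-pass spectrum, and a narrower peak yields a higher cutoff, whereas a peak preceded by a negative lobe of equal area is classified as bandpass, in line with the expectations stated above.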

Figure 9. Characterization of STAs in the frequency domain. A, For the example V1 cell in Figure 2A, the amplitude of the FT of the STA is plotted for ETF 0.2, 3.1, and 25 Hz (thick, medium, and thin lines, respectively). The open circles mark the cutoff frequency, defined to be the frequency at which the amplitude drops to half of the maximum value. B, The cutoff frequency for low ETF is plotted against that at ETF 25 Hz for V1 and MT cells (filled and open circles, respectively). The mean cutoff values for V1 (n = 31) were 6 Hz (SD, 2) and 19 Hz (SD, 5) for low and high ETF, respectively, and for MT (n = 21) were 7 Hz (SD, 3) and 21 Hz (SD, 10). C, For the example V1 cell in Figure 3A, the FT of the STAs are plotted for three SF values: 1.0, 2.0, and 7.9 cycles/degree (thin, medium, and thick lines, respectively). D, Cutoff frequency at | |