Neuronal interactions are an intricate part of cortical information processing generating internal representations of the environment beyond simple one-to-one mappings of the input parameter space. Here we examined functional ranges of interaction processes within ensembles of neurons in cat primary visual cortex. Seven “elementary” stimuli consisting of small squares of light were presented at contiguous horizontal positions. The population representation of these stimuli was compared to the representation of “composite” stimuli, consisting of two squares of light at varied separations. Based on receptive field measurements and by application of an Optimal Linear Estimator, the representation of retinal location was constructed as a distribution of population activation (DPA) in visual space. The spatiotemporal pattern of the DPA was investigated by obtaining the activity of each neuron for a sequence of time intervals. We found that the DPA of composite stimuli deviates from the superposition of its components because of distance-dependent (1) early excitation and (2) late inhibition. (3) The shape of the DPA of composite stimuli revealed a distance-dependent repulsion effect. We simulated these findings within the framework of dynamic neural fields. In the model, the feedforward response of neurons is modulated by spatial ranges of excitatory and inhibitory interactions within the population. A single set of model parameters was sufficient to describe the main experimental effects. Combined, our results indicate that the spatiotemporal processing of visual stimuli is characterized by a delicate, mutual interplay between stimulus-dependent and interaction-based strategies contributing to the formation of widespread cortical activation patterns.
- neural ensembles
- neural field
- optimal linear estimator
- population code
- population dynamics
- receptive field
- striate cortex
- visual field
In recent years, neurons of the visual cortex have been extensively investigated with respect to a diversity of feature attributes. In search of optimal stimulus conditions, they were classified according to differing receptive field (RF) properties. However, RFs can exhibit complex, nonpredictive behavior dependent on further variations of the stimulus parameters. In addition, these complex spatiotemporal response properties can be modified by stimulation displaced from the RF center or from outside the classical RF (Allman et al., 1985; Dinse, 1986; Gilbert and Wiesel, 1990; Sillito et al., 1995). These observations were explained with results from anatomical and physiological studies revealing extensive long-range horizontal intracortical connections (Fisken et al., 1975; Creutzfeldt et al., 1977; Gilbert and Wiesel, 1979, 1990; Kisvárday and Eysel, 1993; Bringuier et al., 1999). Accordingly, optical imaging techniques demonstrated that the cortical processing of even very small objects is associated with a widespread pattern of cortical population activation (Grinvald et al., 1994; Godde et al., 1995).
Neural population analysis refers to the notion that large ensembles of neurons contribute to the cortical representation of sensory or motor parameters. Early formulations of this idea (Erickson, 1974) conceived of the representation of complex stimuli in terms of elementary feature detectors simply as a combination of the simultaneous levels of their activation. In primary motor cortex, ensembles of neurons broadly tuned to the direction of movement have been shown to accurately represent the current value of that parameter (Georgopoulos et al., 1986, 1993). These observations inspired renewed attempts to investigate sensory representations in terms of population codes (Steinmetz et al., 1987; Lee et al., 1988; Vogels, 1990; Young and Yamane, 1992; Wilson and McNaughton, 1993; Nicolelis and Chapin, 1994; Ruiz et al., 1995; Jancke et al., 1996; Kalt et al., 1996; Zhang, 1996; Groh et al., 1997; Sugihara et al., 1998; Zhang et al., 1998) and triggered theoretical work examining the formal basis of coding by populations of neurons (Gielen et al., 1988; Vogels, 1990; Zohary, 1992; Gaal, 1993a,b; Seung and Sompolinsky, 1993; Anderson, 1994a,b; Salinas and Abbott, 1994; Giese et al., 1997; Pouget et al., 1998; Zemel et al., 1998; Zhang et al., 1998).
In this paper we studied how small visual stimuli can be represented by the joint activation of a population of neurons in cat primary visual cortex and how neurons within such a population interact in terms of a common metric dimension, in our case, in visual space.
In a first step, we attempted to extract the contribution of neurons to the representation of the location of small squares of light, which we called “elementary” stimuli (Fig.1 A). We therefore constructed distributions of population activation (DPAs) defined in the visual field that can be regarded as a subspace of the potentially high-dimensional space of visual stimulus attributes. The second step consisted of projecting the neural responses to “composite” stimuli assembled from two squares of light at varied separations (Fig.1 B) onto this subspace by analyzing DPAs weighted with the responses to composite stimuli. Distance-dependent deviations of the DPAs from the superposition of the corresponding elementary components reveal insight into interaction processes within the representation of retinal location at the population level. Such interaction may arise from recurrent connectivity within the cortical area as well as from recurrence within the network providing the sensory input. A neural field model explicates how such mechanisms contribute to the evolution of cortical activation within ensembles of neurons.
MATERIALS AND METHODS
Animals and preparation. Electrophysiological recordings from a total of 178 cells were made extracellularly in the foveal representation of area 17 in 20 adult cats of both sexes. Animals were initially anesthetized with Ketanest (15 mg/kg body weight, i.m.; Parke-Davis, Courbevoie, France) and Rompun (1 mg/kg, i.m.; Bayer, Wuppertal, Germany). Additionally, atropine (0.1 mg/kg, s.c.; Braun) was given. After intubation with an endotracheal tube, animals were fixed in a stereotactic frame. During surgery and recording, anesthesia was maintained by artificial respiration with a mixture of 75% N2O and 25% O2 and by application of sodium pentobarbital (Nembutal, 3 mg · kg−1 · hr−1, i.v.; Ceva). Treatment of all animals was within the regulations of the National Institutes of Health Guide for the Care and Use of Laboratory Animals (1987). Animals were paralyzed by continuous infusions of gallamine triethiodide (2 mg/kg, i.v. bolus; 2 mg · kg−1 · hr−1, i.v.; Sigma, St. Louis, MO). In addition, 5% glucose in physiological Ringer’s solution was continuously infused (3 ml/hr; Braun). Heart rate, intratracheal pressure, expired CO2, body temperature, and EEG were monitored during the entire experiment. Respiration was adjusted for an end-tidal CO2 between 3.5 and 4.0%. The body temperature was kept at 37.5°C by means of a feedback-controlled heating pad. Contact lenses with artificial pupils (3 mm diameter) were used to cover the eyes, which were frequently rinsed with artificial eye liquid (Liquifilm; Pharm-Allergan). Pupils were dilated by atropine (5 mg/ml), and nictitating membranes were retracted by norepinephrine (Neosynephrin-POS, 50 mg/ml; Ursapharm). The bone and dura mater were removed over the central representation of area 17 in the left hemisphere. The exposed cortex was covered with heavy silicone oil. At the end of the experiments, animals were killed with an overdose of sodium pentobarbital.
Data acquisition. We recorded responses of single units in the foveal representation in area 17 of the left hemisphere. Stimuli were always presented to the contralateral eye. Recordings were performed simultaneously with two or three glass-coated platinum electrodes (resistance between 3.5 and 4.5 MΩ; Thomas Recording), which were advanced with a microstepper. The bandpass-filtered (500–3000 Hz) electrode signals were fed into spike sorters based on an on-line principal component analysis (Gawne and Richmond, National Institutes of Health, Bethesda, MD). Their output TTL pulses were stored on a personal computer (PC) with a time resolution of 1 msec. Raw analog recordings were displayed on oscilloscopes and on audio monitors. Digitized neural responses were displayed as poststimulus time histograms (PSTHs) on-line during the recording sessions.
Data were analyzed off-line in the Interactive Data Language graphical environment (Research Systems, Inc.).
Visual stimulation. Stimuli were displayed on a PC-controlled 21 inch monitor (120 Hz, noninterlaced) positioned at a distance of 114 cm from the animal.
An identical set of common stimuli was presented to all neurons: (1) elementary stimuli (Fig. 1 A), small squares of light (size, 0.4 × 0.4°), were flashed at one of seven different horizontally contiguous locations within a fixed foveal reference frame; and (2) composite stimuli (Fig. 1 B), two simultaneously flashed squares of light, were separated by distances that varied between 0.4 and 2.4°. Each stimulus was flashed for 25 msec. The interstimulus interval (ISI) was 1500 msec. There were a total of 32 repetitions of each stimulus, arranged in pseudorandom order across the different conditions. Stimuli had a luminance of 0.9 cd/m2 against a background luminance of 0.002 cd/m2. The retinal position of these common stimuli was constant, irrespective of the RF location of individual neurons (non-RF-centered approach illustrated in Fig.1 C,D4).
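As a quick check of the stimulus geometry, the visual angles quoted above can be converted to on-screen sizes at the stated viewing distance of 114 cm. The following is a minimal sketch; the function name and the assumption that stimuli are centered on the line of sight are illustrative, not taken from the original methods.

```python
import math

VIEWING_DISTANCE_CM = 114.0  # monitor distance stated in the text

def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen extent (cm) of a centered object subtending angle_deg.

    Assumes the object is centered on the line of sight (illustrative).
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

one_degree_cm = visual_angle_to_cm(1.0)      # extent of 1 degree of visual angle
square_cm = visual_angle_to_cm(0.4)          # edge length of one elementary square
max_separation_cm = visual_angle_to_cm(2.4)  # largest composite separation
```

At this distance 1° corresponds to roughly 2 cm on the screen, so each 0.4° square is slightly less than 1 cm wide.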
The profile of each individual RF was assessed quantitatively with a separate set of stimuli, consisting of small dots of light (diameter, 0.64°) that were flashed in pseudorandom order (20 times) for 25 msec (ISI, 1000 msec) on the 36 locations of an imaginary 6 × 6 grid, centered over the hand-plotted RF (response plane technique, Fig.1 D1). To control for eye drift, RF profiles were repeatedly measured during each recording session.
Construction of the DPA
The general idea behind constructing a population distribution is to extract the contributions of neurons to the representation of a particular stimulus parameter. To obtain entire distributions that are defined for visual field location, two types of analysis were applied: (1) based on the measured RF profiles (Fig. 1 D1,D2), the calculated RF centers (Fig. 1 D3) served to construct two-dimensional DPAs by interpolating the normalized firing rates of each contributing neuron with a Gaussian profile (cf. Anderson, 1994a,b, for a related attempt) (Fig. 1 E,F); and (2) to minimize the reconstruction error for the elementary stimulus conditions, we extended the Optimal Linear Estimator (OLE) (Salinas and Abbott, 1994), resulting in one-dimensional DPAs (Fig. 2 C).
Constructing two-dimensional DPAs by Gaussian interpolation. For each location on the 6 × 6 grid, an average response strength was determined for each cell by averaging the firing rate in the time interval between 40 and 65 msec after stimulus onset, corresponding to the peak responses in the PSTHs. RF profiles were obtained (Fig. 1 D2) and smoothed by convolution with a Gaussian profile in two dimensions (half width, 0.64°; Fig. 1 D3). The center of the RF of each cell was then computed as the center of mass of that part of the RF profile that exceeded half of the maximal firing rate.
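The center-of-mass rule for the RF center can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the grid coordinates, and the synthetic response plane are assumptions, and the Gaussian smoothing step is omitted for brevity.

```python
import numpy as np

def rf_center(profile, xs, ys):
    """Center of mass of the part of an RF profile above half its maximum.

    profile: 2-D array of firing rates, shape (len(ys), len(xs)).
    xs, ys:  grid coordinates in degrees of visual angle.
    """
    profile = np.asarray(profile, dtype=float)
    # Keep only grid locations whose response exceeds half the maximal rate.
    weights = np.where(profile > 0.5 * profile.max(), profile, 0.0)
    gx, gy = np.meshgrid(xs, ys)
    total = weights.sum()
    return (weights * gx).sum() / total, (weights * gy).sum() / total

# Hypothetical 6 x 6 response plane: a Gaussian bump centered at (0.5, -0.3) deg,
# sampled with the 0.64 deg spacing suggested by the dot diameter.
xs = np.linspace(-1.6, 1.6, 6)
ys = np.linspace(-1.6, 1.6, 6)
gx, gy = np.meshgrid(xs, ys)
demo = np.exp(-((gx - 0.5) ** 2 + (gy + 0.3) ** 2) / (2 * 0.6 ** 2))
cx, cy = rf_center(demo, xs, ys)
```

With only 36 grid points the estimate is coarse, but it recovers the bump location well within one grid step, which is the resolution the thresholded center of mass can be expected to provide.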
The firing rate, $f_n(s,t)$, of neuron number $n$ to stimulus number $s$ was defined as the firing rate in a 10 msec time interval beginning at time $t$ after stimulus onset, averaged over 32 stimulus repetitions. Spontaneous activity, $b_n$, was estimated as the mean firing rate accumulated over nonstimulus trials. For the purpose of constructing the population representation, the firing rate of each cell was normalized to its maximum firing rate, $m_n$, over all stimuli used to measure the response planes and during any single 10 msec bin in the time interval from stimulus onset to 100 msec after stimulus onset. This normalized firing rate, $F_n(s,t)$ (Equation 1), was always well defined and positive (Fig. 1 D4).
The normalized firing rates, $F_n(s,t)$, were depicted at the position of the calculated RF center of each neuron. For interpolation of the data points, the width of the Gaussian profile was chosen equal to 0.6° in visual space (approximately corresponding to the average RF width of all neurons recorded) (Fig. 2 A). To correct for uneven sampling of visual space by the limited number of RF centers, the distribution was normalized by dividing by a density function, which was simply the sum of unweighted Gaussian profiles (width, 0.64°) centered on all RF centers. This procedure is illustrated in Figure 1, E and F.
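The interpolation with density normalization can be sketched as below. This is an illustrative sketch under stated assumptions: the function and variable names are hypothetical, the RF centers and rates are made up, and a single Gaussian width is used for both the weighted sum and the density (the text quotes 0.6° and 0.64°, respectively).

```python
import numpy as np

def interpolated_dpa(centers, rates, grid_x, grid_y, width=0.6):
    """Density-normalized Gaussian interpolation of normalized firing rates.

    centers: (N, 2) RF centers in degrees; rates: (N,) normalized rates F_n.
    Returns an array of shape (len(grid_y), len(grid_x)).
    Assumption: one common width for the weighted sum and the density.
    """
    centers = np.asarray(centers, dtype=float)
    rates = np.asarray(rates, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)
    # Squared distance from every grid point to every RF center: (N, H, W).
    d2 = (gx[None] - centers[:, 0, None, None]) ** 2 \
        + (gy[None] - centers[:, 1, None, None]) ** 2
    bumps = np.exp(-d2 / (2.0 * width ** 2))
    weighted = (rates[:, None, None] * bumps).sum(axis=0)
    density = bumps.sum(axis=0)  # unweighted sum corrects for uneven sampling
    return weighted / density

# Made-up example: three RF centers, sampled on a small grid around them.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.4]])
grid_x = np.linspace(-1.0, 2.0, 7)
grid_y = np.linspace(-1.0, 1.0, 5)
flat = interpolated_dpa(centers, [0.5, 0.5, 0.5], grid_x, grid_y)
graded = interpolated_dpa(centers, [0.2, 1.0, 0.6], grid_x, grid_y)
```

With a common width the result is a locally weighted average of the rates, so a uniformly active population yields a flat distribution regardless of how unevenly the RF centers sample visual space. That is the purpose of the density correction.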
Deriving the optimal linear estimator for the DPA. An optimal estimation of the DPA is based on the responses to elementary stimuli. For each stimulus position $s_i$, the DPA, $\hat{U}_i(s_k)$, is constructed as a linear combination of contributions from each neuron ($n = 1, \ldots, N$):

$$\hat{U}_i(s_k) = \sum_{n=1}^{N} c_n(s_k)\, f_n(s_i) \tag{2}$$

The number $M$ of sample points $s_k$ determines the degree of resolution with which the DPAs are sampled. The contribution of each neuron is a basis function, $c_n(s_k)$, to be determined by optimization, multiplied with the firing rate, $f_n(s_i)$, averaged over the time interval between 40 and 65 msec after stimulus onset. The desired form of the DPA representation of these stimuli is explicitly chosen as a Gaussian, $U_i(s_k)$, centered on each stimulus position, $s_i$ (Equation 3). The width $\sigma$ = 0.6° was chosen such that $U_i(s_k)$ fits the average RF profile of all measured neurons (Fig. 2 A). To determine the basis functions we minimize the average reconstruction error $\sum_i \big(\hat{U}_i(s_k) - U_i(s_k)\big)^2$ (Seung and Sompolinsky, 1993; Salinas and Abbott, 1994; Pouget et al., 1998), which leads to:

$$c_n(s_k) = \sum_{m=1}^{N} \big(Q^{-1}\big)_{nm}\, L_m(s_k) \tag{4}$$

Here, $Q_{nm}$ is the correlation matrix between the firing rates of neurons $n$ and $m$ for all stimuli:

$$Q_{nm} = \sum_i f_n(s_i)\, f_m(s_i) \tag{5}$$

and $L_m(s_k)$ is:

$$L_m(s_k) = \sum_i f_m(s_i)\, U_i(s_k) \tag{6}$$

This amounts to an OLE for a vector-valued stimulus parameter (Salinas and Abbott, 1994).
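The least-squares step can be sketched numerically as follows. This is a hedged illustration rather than the authors' implementation: the Gaussian target form $\exp[-(s_k - s_i)^2 / (2\sigma^2)]$, the random firing rates, and all names are assumptions. Because the correlation matrix is built from only seven stimuli, it has rank at most seven when more than seven neurons are used, so the sketch applies a pseudo-inverse; how the original analysis handled this is not stated in the text.

```python
import numpy as np

def ole_basis(rates, targets):
    """Least-squares basis functions c_n(s_k) for a linear population readout.

    rates:   (S, N) mean firing rate f_n(s_i) of N neurons to S stimuli.
    targets: (S, M) desired DPA U_i(s_k) sampled at M points.
    Returns c with shape (N, M) so that rates @ c approximates targets.
    """
    rates = np.asarray(rates, dtype=float)
    targets = np.asarray(targets, dtype=float)
    Q = rates.T @ rates     # Q_nm  = sum_i f_n(s_i) f_m(s_i)
    L = rates.T @ targets   # L_m(s_k) = sum_i f_m(s_i) U_i(s_k)
    # Pseudo-inverse (cutoff 1e-10) because Q is rank-deficient when N > S.
    return np.linalg.pinv(Q, 1e-10) @ L

rng = np.random.default_rng(0)
stim_pos = -1.2 + 0.4 * np.arange(7)          # seven elementary positions (deg)
samples = np.linspace(-2.4, 2.4, 49)          # sample points s_k (deg)
# Assumed Gaussian target with width 0.6 deg, centered on each stimulus.
targets = np.exp(-(samples[None, :] - stim_pos[:, None]) ** 2 / (2 * 0.6 ** 2))
rates = rng.uniform(0.0, 1.0, size=(7, 20))   # made-up rates: 7 stimuli x 20 neurons
c = ole_basis(rates, targets)
reconstruction = rates @ c                    # one reconstructed DPA per stimulus
```

With more neurons than elementary stimuli the targets are reproduced essentially exactly, which is consistent with the very small localization error reported for the OLE-derived DPAs.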
This estimator can then be extrapolated to obtain time-resolved DPAs by replacing the averaged firing rate $f_n(s_i)$ in Equation 2 by the firing rate in a particular time interval. The coefficients $c_n(s_k)$, by contrast, remain fixed. This extrapolated DPA is the basis for investigating the nonlinear interaction effects within the composite stimulus paradigm. We compare the superpositions

$$\hat{U}^{\mathrm{sup}}_{ij}(s_k,t) = \hat{U}_i(s_k,t) + \hat{U}_j(s_k,t) \tag{7}$$

of the time-resolved DPAs for two elementary stimuli $s_i$ and $s_j$ with the time-resolved DPAs of composite stimuli

$$\hat{U}^{\mathrm{meas}}_{ij}(s_k,t) = \sum_{n=1}^{N} c_n(s_k)\, f_n(s_i,s_j,t) \tag{8}$$

Here $\hat{U}^{\mathrm{meas}}_{ij}(s_k,t)$ is the extrapolated DPA obtained by replacing the rate $f_n(s_i)$ in Equation 2 by the firing rates $f_n(s_i,s_j,t)$ that are observed in response to the corresponding composite stimulus.
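The comparison between superposition and measured composite DPA reduces to applying the same fixed coefficients to different firing rates. A minimal sketch follows, with hypothetical names and random numbers standing in for the basis functions and the time-resolved rates:

```python
import numpy as np

def time_resolved_dpa(c, rates_t):
    """Apply fixed basis functions c (N, M) to rates per time bin (T, N)."""
    return np.asarray(rates_t, dtype=float) @ np.asarray(c, dtype=float)

def compare(c, rates_i, rates_j, rates_ij):
    """Return superposition, measured composite DPA, and their difference."""
    sup = time_resolved_dpa(c, rates_i) + time_resolved_dpa(c, rates_j)
    meas = time_resolved_dpa(c, rates_ij)
    return sup, meas, meas - sup

rng = np.random.default_rng(1)
c = rng.normal(size=(5, 11))      # hypothetical basis: 5 neurons, 11 sample points
r_i = rng.uniform(size=(4, 5))    # 4 time bins x 5 neurons, stimulus i alone
r_j = rng.uniform(size=(4, 5))    # same neurons, stimulus j alone
# Purely additive responses: the null hypothesis of no interaction.
sup, meas, diff = compare(c, r_i, r_j, r_i + r_j)
# Responses scaled to 70% of the sum: a uniformly sub-additive population.
_, _, diff_sub = compare(c, r_i, r_j, 0.7 * (r_i + r_j))
```

Because the readout is linear, any nonzero difference must originate in the firing rates themselves. Under additive responses the difference vanishes, whereas sub-additive responses produce a negative difference everywhere.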
Distributions of population activation of elementary stimuli
We constructed DPAs in response to a set of small squares of light that differ only in their position along a virtual horizontal line and that we termed elementary stimuli. The DPAs were defined in visual space and were based on single cell responses from 178 neurons recorded in the foveal representation of cat area 17. To obtain DPAs, we made use of two different approaches: (1) in a two-dimensional Gaussian interpolation procedure, the RF centers were weighted with the normalized firing rate of each neuron (Fig. 1 D–F). Corresponding to the average RF profile of all neurons recorded (compare Fig. 2 A), the width of the Gaussian was set uniformly to 0.6°; and (2) in addition, based on the assumption that the representation of visual location can be considered as a function of activation in parameter space, we minimized the error for reconstructing one-dimensional distributions using the OLE procedure. This method is optimal in the sense that it extracts the available information from the firing rates under the condition of a least-squares fit.
As a reference, we calculated DPAs in the time interval between 40 and 65 msec after stimulus onset corresponding to the peak responses in the PSTHs. Both approaches yielded equivalent results. The DPAs were monomodal and centered onto each respective visual field position. For each stimulus, Figure 2 B depicts the two-dimensional DPAs of all seven elementary stimuli constructed by Gaussian interpolation. Figure 2 C shows the OLE-derived one-dimensional DPAs. The spatial arrangement of activity within these distributions implies that neurons in primary visual cortex contribute as an ensemble to the representation of visual field location, although the RF of each neuron might be broadly tuned to stimulus location.
For extrapolation, DPAs were obtained by substituting the neural activity observed in other time intervals or in response to composite stimuli.
Temporal evolution of the DPAs of elementary stimuli
The main emphasis of this study was to explore cortical interaction processes. It appears conceivable that such processes can be traced during the entire temporal structure of neuron responses because of differences of time constants of excitatory and inhibitory contributions (Bringuier et al., 1999) and because of time-delayed feedback (Dinse et al., 1990). Accordingly, as an important prerequisite, time-resolved DPAs were constructed for a number of subsequent time intervals after stimulus onset using the firing rates within each time slice as weights. Figures 3 and 4 illustrate the temporal evolution of the DPAs from 30 to 80 msec after stimulus onset for two selected elementary stimuli. There is a remarkable spatial coherence of activity within the ensemble. The gradual build-up and decay of activation were quite uniform across the distributions of all elementary stimuli.
On average, the DPAs constructed by Gaussian interpolation reached their maximal level of activation 54 ± 4 msec after stimulus onset as compared to 53 ± 5 msec for the OLE-derived DPAs (see Fig. 9 B). To quantitatively assess the accuracy with which the DPAs represent the position of the elementary stimuli during the entire time course of responses analyzed (30–80 msec), we compared the position of the maximum of each DPA to the respective stimulus position. Figure 5 plots these constructed positions against the real stimulus positions. Results from both reconstruction methods revealed that the DPAs represent stimulus position during this investigated time window. The average deviation was 0.20 ± 0.11° for the interpolated DPAs and 0.02 ± 0.02° for the distributions based on optimal estimation. The optimal estimation allowed us to avoid reconstruction errors but might suppress systematic errors that were revealed by the interpolation procedure (Fig. 5 A). Interestingly, in a recent psychophysical study, briefly presented stimuli have been found to be mislocalized more foveally (Müsseler et al., 1999).
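The accuracy measure described here, the distance between the DPA maximum and the true stimulus position, can be sketched as follows. The names and the synthetic DPAs (Gaussians deliberately displaced by 0.1°) are illustrative assumptions.

```python
import numpy as np

def decoded_positions(dpas, sample_points):
    """Location of the maximum of each DPA (one DPA per row)."""
    return np.asarray(sample_points)[np.argmax(np.asarray(dpas), axis=-1)]

def localization_error(dpas, sample_points, stim_positions):
    """Mean and standard deviation of |decoded - true| position, in degrees."""
    decoded = decoded_positions(dpas, sample_points)
    err = np.abs(decoded - np.asarray(stim_positions, dtype=float))
    return err.mean(), err.std()

samples = np.linspace(-2.0, 2.0, 81)      # 0.05 deg resolution
stim_pos = -1.2 + 0.4 * np.arange(7)      # seven elementary positions (deg)
# Made-up DPAs: Gaussians displaced by 0.1 deg from the true positions.
shifted = stim_pos[:, None] + 0.1
dpas = np.exp(-(samples[None, :] - shifted) ** 2 / (2 * 0.6 ** 2))
mean_err, sd_err = localization_error(dpas, samples, stim_pos)
```

Note that the resolution of this measure is bounded by the spacing of the sample points, so deviations smaller than that spacing cannot be resolved.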
Nonlinear interactions: time-averaged inhibition
We addressed the question of neural interactions within the population representation. If there were no interactions within the population, then the DPAs of the composite stimuli would be predicted to be the linear superpositions of the DPAs of the component elementary stimuli. To test this null hypothesis, we built DPAs based on the same estimator used for elementary stimuli, but now weighting the contribution of each cell with the firing rate observed in response to the composite stimuli.
First, we examined interaction effects by comparing the time-averaged (from 30 to 80 msec) population representations. Figure 6 illustrates the DPAs derived by interpolation; Figure 7 shows the OLE-derived DPAs of composite stimuli and their superpositions. Both the measured and the superimposed DPAs are monomodal for small, and bimodal for large stimulus separations, the transition occurring at ∼1.6° separation.
The most striking deviation from the linear superposition (Fig. 6, bottom; Fig. 7, dashed line) was a reduction of activity in the measured responses (Fig. 6, top; Fig. 7, solid line), which is particularly strong for small stimulus separations. This reduction is not caused by a saturation of population activity because it is also observed for composite stimuli of larger separations where the distributions are bimodal and have little overlap. Note that in this case the levels of activation in the composite representations are even lower than for the corresponding elementary stimuli (see Fig. 9 B, horizontal arrow). A quantitative assessment of this inhibitory interaction reveals its dependence on stimulus distance. The total activation in the population distribution was computed as the area under the distribution and is expressed as a percentage of the total activation contained in the superposition. This percentage is always <100%, indicating inhibition, which is strongest for small distances and decreases with increasing distances (Fig. 8).
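The total-activation measure can be sketched as a ratio of areas under the two distributions. The trapezoidal rule, the function names, and the synthetic distributions below are assumptions for illustration; the text specifies only that the area under the distribution was used.

```python
import numpy as np

def total_activation(dpa, sample_points):
    """Area under a one-dimensional DPA (trapezoidal rule)."""
    y = np.asarray(dpa, dtype=float)
    x = np.asarray(sample_points, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def percent_of_superposition(measured, superposition, sample_points):
    """Measured total activation as a percentage of the superposition's."""
    return 100.0 * total_activation(measured, sample_points) \
        / total_activation(superposition, sample_points)

x = np.linspace(-3.0, 3.0, 121)
# Made-up superposition of two Gaussian DPAs separated by 1.6 deg.
sup = np.exp(-(x + 0.8) ** 2 / (2 * 0.6 ** 2)) + np.exp(-(x - 0.8) ** 2 / (2 * 0.6 ** 2))
meas = 0.8 * sup                      # hypothetical composite response, 20% weaker
pct = percent_of_superposition(meas, sup, x)
```

Values below 100% indicate that the composite stimulus evokes less total activation than the sum of its components, which is how inhibition is read out in this analysis.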
A slight gradient of the amplitudes and the time courses within the DPAs of the elementary stimuli was assumed to account for the asymmetric deviations of the measured distributions compared to the superpositions at 1.2 and 1.6° stimulus separation (Fig. 7). Therefore, interaction processes may amplify this inhomogeneity by shifting the maximal amplitude of the distributions toward the nasally located stimulus component (for details, see “Dynamic neural field model”). Note that the inhomogeneity became additionally apparent in the superpositions of the Gaussian-interpolated DPAs (Fig. 6). In contrast to the optimal estimation procedure, this method does not normalize the small gradient of amplitudes observed in the distributions of the elementary stimuli.
Nonlinear interaction: early excitation–late inhibition
To investigate the time structure of interaction, we further analyzed the OLE-derived DPAs by comparing representations of composite stimuli either to the representations of elementary stimuli or to their superpositions. We therefore calculated the activation around the nasally positioned component because it was at the same retinal location for all composite stimuli. As a quantitative measure, we integrated activity within a band of ±0.4° around that particular visual field position (Fig.9 A, vertical arrow).
Figure 9 B (solid line) displays the temporal evolution of activity at 5 msec intervals for the different composite stimuli (illustrated in Fig. 9 A). The response to the nasally positioned elementary stimulus alone is shown as a dashed line. There are notable differences between elementary and composite stimuli in an early and a late response epoch. At small separations between the component stimuli, the response has a 7 msec shorter latency (p < 0.001, ANOVA) as compared to the single stimulus condition. This is accompanied by an earlier onset of the decay of the population activity. By contrast, the late part of the response is always inhibited.
For quantitative evaluation, we divided time into an early (30–45 msec) and a late (45–80 msec) epoch. For the early period, we compared the population representation of composite stimuli to the superpositions. Because we expect to find excitatory interaction, this is a conservative comparison: saturation effects would tend to limit the responses. The solid line in Figure 10 shows the difference between the activation in response to the composite stimuli and the activation in the superimposed responses, expressed in percent of the latter. In this early response epoch, there was more activation in the measured than in the superimposed responses at all distances except the largest (2.4°). This excess activation, which reached a maximum of 58% at a stimulus distance of 1.6°, is evidence of distance-dependent excitatory interaction during the build-up phase of the DPAs of composite stimuli.
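The band integration around the nasal component and the epoch-wise percentage difference can be sketched as follows. All names are hypothetical, the DPA is a made-up constant, and the factor 1.58 is used only to mirror the 58% excess quoted above.

```python
import numpy as np

def band_activation(dpa, x, center, half_width=0.4):
    """Trapezoidal integral of each DPA (last axis) within center +/- half_width."""
    x = np.asarray(x, dtype=float)
    dpa = np.asarray(dpa, dtype=float)
    inside = np.abs(x - center) <= half_width + 1e-9  # tolerate rounding at edges
    xb, yb = x[inside], dpa[..., inside]
    return np.sum(0.5 * (yb[..., 1:] + yb[..., :-1]) * np.diff(xb), axis=-1)

def epoch_mean(values, times, start, stop):
    """Mean over time bins with start <= t < stop (msec after stimulus onset)."""
    times = np.asarray(times)
    return np.asarray(values)[(times >= start) & (times < stop)].mean(axis=0)

def percent_difference(measured, reference):
    """Difference between measured and reference, in percent of the reference."""
    return 100.0 * (measured - reference) / reference

x = np.linspace(-2.0, 2.0, 41)
times = np.arange(30, 80, 5)                  # bin onsets: 30, 35, ..., 75 msec
flat = np.ones((times.size, x.size))          # made-up DPA of constant height 1
band = band_activation(flat, x, center=0.0)   # one value per time bin
early = epoch_mean(band, times, 30, 45)
late = epoch_mean(band, times, 45, 80)
excess = percent_difference(1.58 * early, early)  # illustrative 58% excess
```

Positive values indicate excess activation relative to the reference (superposition or elementary response), and negative values indicate inhibition.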
That the activation with composite stimuli exceeded even that of the superpositions demonstrates that response saturation is not the cause of the apparent inhibitory interactions observed in the time-averaged analysis. Accordingly, the time-averaged inhibitory effect (compare Figs. 6, 7) originates from the late response epoch of 45–80 msec after stimulus onset. For this epoch, the dashed line in Figure 10 shows the relative difference of responses to composite as compared to elementary stimuli. At all stimulus separations, the difference is negative, indicating inhibition below the activation level for a single stimulus. This inhibition is slightly stronger for larger stimulus separations, providing further evidence for distance-dependent late inhibitory interaction. Moreover, it confirms that response saturation is not an explanation for this inhibitory effect.
Spatial interaction: repulsion effect
The neural field model predicts (see next section) that inhibitory interactions are dominant at larger distances, resulting in a repulsion effect for the apparent position of two stimulus components. We tested this prediction using the OLE-derived distributions. As described, the DPAs were bimodal at stimulus separations between 1.6 and 2.4°. In fact, at these distances we found that the maxima of the DPAs were shifted outward by ∼0.3° as compared to the corresponding maxima of the superposition (Fig. 11). This repulsion effect was particularly strong in the time window of 60–80 msec after stimulus onset, where inhibition is dominant.
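The repulsion measure compares the positions of the two maxima in the measured and superimposed distributions. A minimal sketch follows, assuming the two peaks lie on opposite sides of the midpoint between the stimulus components; the names and the synthetic bimodal profiles are illustrative.

```python
import numpy as np

def peak_pair(dpa, x, midpoint):
    """Positions of the maxima to the left and right of the midpoint."""
    dpa = np.asarray(dpa, dtype=float)
    x = np.asarray(x, dtype=float)
    left = x < midpoint
    x_left = x[left][np.argmax(dpa[left])]
    x_right = x[~left][np.argmax(dpa[~left])]
    return x_left, x_right

def outward_shift(measured, superposition, x, midpoint):
    """Shift of each measured peak away from the midpoint (positive = repulsion)."""
    m_left, m_right = peak_pair(measured, x, midpoint)
    s_left, s_right = peak_pair(superposition, x, midpoint)
    return s_left - m_left, m_right - s_right

def bimodal(x, half_separation, width=0.4):
    """Made-up bimodal profile: two Gaussians at +/- half_separation."""
    return np.exp(-(x + half_separation) ** 2 / (2 * width ** 2)) \
        + np.exp(-(x - half_separation) ** 2 / (2 * width ** 2))

x = np.linspace(-3.0, 3.0, 121)       # 0.05 deg resolution
sup = bimodal(x, 1.0)                 # superposition: peaks at -1.0 and +1.0 deg
meas = bimodal(x, 1.3)                # hypothetical measured DPA, peaks pushed apart
shift_left, shift_right = outward_shift(meas, sup, x, midpoint=0.0)
```

Splitting at the midpoint only works in the bimodal regime, which in the data corresponds to separations of about 1.6° and larger.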
Note that all results concerning interaction and temporal evolution were equivalent when obtained from the two different approaches of DPA construction.
Dynamic neural field model
A theoretical model of the temporal evolution of the population representation and the interaction effects is formulated to substantiate our interpretation of the results. The model is embedded in a general framework that bridges neuronal and behavioral levels of description (for review, see Schöner et al., 1997). The elementary stimuli flashed at different positions on a horizontal line in the visual field are thought of as defining a one-dimensional space, in which the dependence of interaction on distance is probed. At each position, $x$, an activation variable, $u(x)$, is introduced that defines a field of neural activation along the horizontal dimension of visual space.
This neural field is assumed to evolve continuously in time under two different types of inputs: (1) afferent input from sensory stimulation activates those regions of the field that represent the specified values of the parameter space; and (2) inputs from interaction processes within the field exert excitatory or inhibitory effects onto the field. Which locations excite or inhibit each other is determined by interaction kernels $w_u(x)$ and $w_v(x)$, respectively. These are derived under the assumption that the nature and strength of the interactions between different sites in the field depend on the distance between those sites. The identification of appropriate kernels, which can explain the overall time scale of build-up and decay as well as the spatial width of the measured population responses, is thus the primary modeling task. The modeling is not aimed at reproducing the experimental data in all detail, but at identifying a simple mathematical description that can be used to support and clarify the interpretation of the main experimental findings.
As a rule, the response of the neural population to briefly flashed visual stimuli is transient. The time structure of the DPAs reveals dynamic properties of the cortical neural network that go beyond passive filtering. We refer to such responses as active or self-generated transients. To account for this nontrivial time structure of the population response, we introduce a second variable, $v$, at each site of the field. This variable is excited by activation in the $u$ field and inhibits, in turn, that field at the corresponding site.
The mathematical description we use is given by Equation 9. A similar mathematical framework has been used by Amari (1977) to discuss the dynamics of pattern formation in cortical neuronal tissues. He focused primarily on stable stationary states, consisting of localized peaks of activation, whereas only spatially homogeneous (nonlocal) patterns were studied as transient solutions.
The lateral connections are functions of the distance ($x − x′$) of positions $x$, $x′$ in visual space. Numerical studies with different types of kernels (e.g., Gaussians, exponentials, rectangular forms) revealed that the interesting qualitative properties of the solutions of Equation 9 are largely independent of the particular analytical form of the kernels $w_u$ and $w_v$, as long as they preserve characteristic relationships of amplitude of inhibition and excitation as well as of the spatial range of these two factors. The simulations shown below are based on two Gaussians (Equation 10), where the amplitudes $A_u$, $A_v$ and the range parameters $\sigma_u$, $\sigma_v$ are positive constants. A general constraint arises from the requirement that the excitatory response does not spread out. This requires that the spatial range of inhibitory interactions exceed that of excitatory interactions ($\sigma_v > \sigma_u$).
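The range constraint can be illustrated by comparing two Gaussian kernels directly. This is only a sketch: the Gaussian form $A \exp[-x^2/(2\sigma^2)]$, the amplitude values, and the direct subtraction of the two kernels are assumptions made for illustration, since in the model inhibition acts through a second field rather than through a single difference kernel.

```python
import numpy as np

def gaussian_kernel(x, amplitude, sigma):
    """Assumed kernel form: amplitude * exp(-x^2 / (2 sigma^2))."""
    return amplitude * np.exp(-np.asarray(x, dtype=float) ** 2 / (2.0 * sigma ** 2))

A_u, sigma_u = 0.3, 5.0    # excitation: stronger, short range (assumed values)
A_v, sigma_v = 0.2, 10.0   # inhibition: weaker, longer range (assumed values)

distance = np.arange(0, 41)   # grid units; 5 units represent 0.2 deg in the text
net = gaussian_kernel(distance, A_u, sigma_u) - gaussian_kernel(distance, A_v, sigma_v)
crossover = int(distance[np.argmax(net < 0)])   # first distance with net inhibition
```

With these values excitation dominates between nearby sites and inhibition dominates at every larger distance, which is the qualitative arrangement that keeps an excited region from spreading.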
The threshold function $F$ in Equation 9 must be monotonically increasing and nonlinear, but again its particular functional form is of little importance for the qualitative behavior of the field dynamics. We used the well-known sigmoidal function $F(u) = 1/[1 + \exp(−bu)]$. For given interaction kernels, a lower limit for the slope $b > 0$ can be obtained such that the existence of self-generated, transient responses is guaranteed.
The interaction terms are multiplied by the state-dependent sigmoidal signal $F(u)$. This factor prevents the asymptotic transient response from falling below resting level because only those sites in the field that are sufficiently activated are susceptible to inhibitory interaction.
The parameter τ in Equation 9 determines the overall time scale of build-up and decay of the field activity and can be adjusted to reproduce qualitatively the measured time course of population activity changes. In the numerical studies, we have used the value τ = 15. A fixed criterion (5% above resting level) was used to define the response onset in the experiments. For the simulations, the afferent transient stimulus $S(x,t)$ at position $x$, applied for a duration Δt = 25 msec, is a Gaussian profile characterized by its strength, $A_s$, and width parameter, 2σ. The choice of σ fixes the spatial units relative to the experimental space scale. All range parameters used in the model simulations were chosen as multiples of σ = 5, which represents 0.2° in visual space.
If this transient external input creates enough excitation within the field, the excitatory response develops a single spatial maximum located at the center, $x$, of the stimulated segment. This is followed by a process of relaxation to the resting state driven by increasing inhibition in the field. The activation level of this resting state is a homogeneous and stable solution of the model dynamics, fixed by the parameter $h < 0$ ($h = −3$ for the simulations shown here).
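The transient described here can be reproduced qualitatively with a small simulation. The sketch below is an assumption-laden illustration, not the published model: the exact form of Equation 9 is not reproduced in this text, so the code assumes a two-layer Amari-type field in which $\tau \dot{u} = -u + h + S + F(u)\,[w_u * F(u) - v]$ and $\tau \dot{v} = -v + w_v * F(u)$, where $*$ denotes spatial convolution. Only τ = 15, h = −3, the 25 msec stimulus duration, and the sigmoid come from the text; kernel amplitudes, ranges, stimulus strength and width, the slope b, and the Euler scheme are assumed.

```python
import numpy as np

def sigmoid(u, b=1.0):
    """Threshold function F(u) = 1 / (1 + exp(-b u)) from the text."""
    return 1.0 / (1.0 + np.exp(-b * u))

def gaussian_kernel(sigma, amplitude):
    """Gaussian interaction kernel on the unit-spaced field grid (assumed form)."""
    half = int(4 * sigma)
    x = np.arange(-half, half + 1, dtype=float)
    return amplitude * np.exp(-x ** 2 / (2.0 * sigma ** 2))

def simulate(stim_centers, n=201, tau=15.0, h=-3.0, b=1.0, dt=1.0,
             t_pre=200, t_on=25, t_post=500, A_s=6.0, sigma_s=5.0,
             A_u=0.3, sigma_u=5.0, A_v=0.2, sigma_v=10.0):
    """Euler integration of the ASSUMED two-layer field; returns grid and u(t, x)."""
    x = np.arange(n, dtype=float)
    w_u = gaussian_kernel(sigma_u, A_u)   # short-range excitation
    w_v = gaussian_kernel(sigma_v, A_v)   # longer-range drive to the inhibitory field
    stim = np.zeros(n)
    for center in stim_centers:
        stim += A_s * np.exp(-(x - center) ** 2 / (2.0 * sigma_s ** 2))
    u = np.full(n, h, dtype=float)
    v = np.zeros(n)
    history = np.empty((t_pre + t_on + t_post, n))
    for step in range(history.shape[0]):
        # The stimulus is on only for t_on steps after the pre-relaxation period.
        s = stim if t_pre <= step < t_pre + t_on else 0.0
        fu = sigmoid(u, b)
        excitation = np.convolve(fu, w_u, mode="same")
        du = (-u + h + s + fu * (excitation - v)) / tau
        dv = (-v + np.convolve(fu, w_v, mode="same")) / tau
        u = u + dt * du
        v = v + dt * dv
        history[step] = u
    return x, history

x, hist = simulate([100.0])   # single stimulus at the center of the field
rest = hist[199]              # relaxed field just before stimulus onset
at_offset = hist[224]         # field at the end of the 25-step stimulus
```

Under these assumed parameters the field forms a single peak at the stimulated site and then returns to its resting level, which is the self-generated transient the text describes. Passing two centers, for example `simulate([90.0, 110.0])`, gives a composite-stimulus run for comparison with the single-stimulus response.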
Figure 9 compares the temporal evolution of population activation in the experiment (B) and in the model (C). Composite stimuli with six spatial separations were used. The same normalization procedures for the simulated data were applied as for the experimental data. To further facilitate the comparison of theory and experiment, a time interval of 25 msec before stimulus onset was added, so that the field dynamics has relaxed to its resting state. This time window accounts for the temporal delay between the stimulus presentation and the cortical response in the experiment.
Distance-dependent early excitation and late inhibition are observed by comparing the temporal evolution of the field for composite stimuli with its response to the single input at the nasal location. Note that in the experiment, the limit case of two independent peaks not interacting at all is not reached even at the largest probed distances between the component stimuli. At that largest separation, an inhibition effect can still be seen in the time course of activation (Fig. 9 B,C, horizontal arrows).
In the spatial domain, nonlinear interactions are observed as differences in shape and location of the time-averaged spatial profile (Fig. 12 A,B) of the calculated superposition compared to the composite stimulation. In the model, the two excited regions attract each other to unite into one excited region when they interact directly through the excitatory connections. Conversely, when two peaks of activation are induced at somewhat larger distances, they interact primarily through the longer range inhibitory interactions, and this leads to the documented repulsion of the two peaks.
To further emphasize the role of time in the interaction process, we have explored the influence of a small inhomogeneity (up to 5 msec) in the temporal evolution of the field on the emerging activity patterns. A slightly faster growth of activation at one field site causes an asymmetry in the competition strength between neighboring activity peaks. In each time step, the activity-dependent strength of inhibition exerted by the other local excitation is always smaller for the temporally privileged location. This imbalance finally leads to a difference in peak amplitude at the two stimulation sites. Note, however, that the averaged superposition profile can still be symmetric when the mechanism that causes the difference in the temporal evolution has little effect on the maximum peak response of the elementary stimuli. This condition can easily be met by introducing a position-dependent slight variation of the input strength (compare Figs. 7, 13).
Effects of interaction across distributed cortical representations are widely discussed as an important aspect of cortical function. If interaction contributes significantly to neural activation in visual cortex, then representations of the visual environment will differ from a simple feedforward remapping of visual space. To investigate the presence and magnitude of interaction processes in cat primary visual cortex, we constructed DPAs from the activity of an ensemble of neurons in response to single squares of light.
Construction of distributions of population activation
Using two different approaches, DPAs were defined in the parameter space of the visual field, which enabled us to analyze ranges of excitatory and inhibitory interactions in terms of the stimulus metrics.
Instead of asking how accurately the parameter of stimulus location can be reconstructed or decoded, we were primarily interested in analyzing interaction-based deviations of population representations that depend on defined variations of stimulus configurations. In this respect, our approach departs in an important way from estimation theory, despite the interests we share with it. Our analysis aimed to investigate how the representation of retinal position evolves in time and how it is affected by interaction among neurons. Moreover, reading out discrete sample points such as peak maxima does not imply that the brain actually uses such measures for decoding.
When composite stimuli consisting of two squares of light were used, the deviations of the distributions from additivity were considered as active contributions from neural interaction, i.e., how interactions distort the distribution of activation. We conclude that such contributions can be regarded as additional information generated by the neural system dependent on context and its actual state.
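The additivity comparison described above amounts to subtracting the superposition of the elementary DPAs from the composite DPA. The following minimal sketch assumes that all three distributions are sampled on the same grid in visual space; the function name `interaction_residual` is hypothetical.

```python
import numpy as np


def interaction_residual(dpa_composite, dpa_elem_a, dpa_elem_b):
    """Deviation of the composite DPA from the sum of its two components.

    Positive values mark locations where the composite response exceeds the
    superposition, negative values where it falls below it.
    """
    comp = np.asarray(dpa_composite, dtype=float)
    a = np.asarray(dpa_elem_a, dtype=float)
    b = np.asarray(dpa_elem_b, dtype=float)
    if not (comp.shape == a.shape == b.shape):
        raise ValueError("all DPAs must be sampled on the same grid")
    return comp - (a + b)
```

A residual that is zero everywhere would correspond to the purely additive case, in which no interaction is detectable at the population level.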
It is important to note that both approaches used to derive DPAs revealed qualitatively equivalent results, implying that the exact way in which the distributions were constructed was not crucial for the observed interaction effects.
Interaction within the population representation of composite stimuli
The use of time-resolved DPAs allowed us to identify signatures of interaction processes that were dependent on time and on the distance of the composite stimuli. In the first 30–45 msec after stimulus onset we found evidence for excitatory interaction, which decreases with increasing distance between the two components. In contrast, when activation was integrated over the later part of the response (45–80 msec after stimulus onset), inhibitory interaction dominated. We provided several arguments that exclude saturation of neural firing rates as an alternative explanation.
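To make the window-based comparison concrete, the sketch below integrates time-resolved activity over the early (30–45 msec) and late (45–80 msec) windows named above and expresses the composite response relative to the superposition. The relative index is a hypothetical summary measure introduced here for illustration; it is not necessarily the quantity used in our analysis.

```python
import numpy as np

EARLY_MS = (30.0, 45.0)  # early window after stimulus onset (from the text)
LATE_MS = (45.0, 80.0)   # late window after stimulus onset (from the text)


def window_sum(activity, times_ms, window):
    """Sum activity (time x space) over all time bins with start <= t < stop."""
    start, stop = window
    times_ms = np.asarray(times_ms, dtype=float)
    mask = (times_ms >= start) & (times_ms < stop)
    return np.asarray(activity, dtype=float)[mask].sum()


def interaction_index(composite, superposition, times_ms, window):
    """Relative deviation of the composite response from the superposition.

    Values > 0 suggest net excitatory interaction in the window,
    values < 0 net inhibitory interaction.
    """
    c = window_sum(composite, times_ms, window)
    s = window_sum(superposition, times_ms, window)
    return (c - s) / s
```

Here `composite` and `superposition` are arrays with one row per time bin and one column per position, and `times_ms` holds the start time of each bin.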
An additional indication for the presence of inhibitory interaction was found by analyzing the spatial shapes of the DPAs. Mutual repulsion of the maxima of the DPAs was observed at stimulus separations between 1.6 and 2.4°, at which the distributions were bimodal. Such repulsion effects qualitatively match psychophysical results obtained from humans. Errors that occur when human subjects estimate the visual distance between two spots of light depend systematically on the retinal distance of the stimuli. Small separations are underestimated, whereas large distances are overestimated (Hock and Eastman, 1995). Similar results have been obtained for estimation of the orientation of stimuli (Westheimer, 1990; for theoretical modeling see Lehky and Sejnowski, 1990). In addition, mislocation effects have been described for other sensory modalities, such as the tactile saltation effect (Geldard and Sherrick, 1972), supporting the assumption of a general cortical nature of such phenomena (Kalt et al., 1996).
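Repulsion of this kind can be quantified as the difference between the separation of the two DPA maxima and the separation of the stimuli. The following sketch assumes a smooth, bimodal DPA sampled on a regular grid; the simple neighbor-comparison peak detection and the function names are illustrative choices, not the procedure used for the data.

```python
import numpy as np


def two_highest_peaks(x, dpa):
    """Return the positions of the two highest local maxima, sorted by position."""
    x = np.asarray(x, dtype=float)
    dpa = np.asarray(dpa, dtype=float)
    # Interior samples that exceed their left neighbor and are not below the right one.
    is_peak = (dpa[1:-1] > dpa[:-2]) & (dpa[1:-1] >= dpa[2:])
    idx = np.flatnonzero(is_peak) + 1
    if idx.size < 2:
        raise ValueError("distribution is not bimodal")
    top = idx[np.argsort(dpa[idx])[-2:]]
    return np.sort(x[top])


def separation_shift(x, dpa, stim_a, stim_b):
    """Peak separation minus stimulus separation.

    Positive values indicate repulsion of the maxima,
    negative values indicate attraction.
    """
    left, right = two_highest_peaks(x, dpa)
    return (right - left) - abs(stim_b - stim_a)
```

For noisy distributions, smoothing before peak detection would be needed; that step is omitted here.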
Dynamic neural field model
A dynamic neural field model was introduced for theoretical treatment of the dynamics of neural population activity (Schöner et al., 1997). Models of the same mathematical format have been proposed in the past as models of dynamic cortical processing (Wilson and Cowan, 1973; Amari, 1977). It is important to note that the entire set of experimental results could be accounted for by a single set of parameter values. The construction of the population representation was used to map neural data onto the visual field. Correspondingly, the neural field was likewise defined over visual space. The activation variables u and v in Equation 9 represent the accumulated excitation and inhibition within the population of neurons. The structure of the postulated interaction function consists of both excitatory and inhibitory coupling. Because the amplitude of the excitatory contribution to interaction is higher and its spatial extent is narrower than for the inhibitory contribution, the net interaction within the representation is excitatory over short distances in visual space, and inhibitory at larger spatial separation.
The absolute values of range parameters used for the numerical studies revealed that the excitatory and the inhibitory processes extend over a range of 0.6 and 1.0° of visual field, respectively. The strength of inhibition and excitation strongly influences the width of the emerging activity distribution, and thus the spatial separation at which a transition from a monomodal to a bimodal representation occurs. Our simulations showed that even those representations that overlap only for the smallest separation still can reveal the effect of late inhibition and early excitation, indicating that the width of the distributions has only little effect on the time course of interaction. A small number of parameters were sufficient for modeling the complex spatiotemporal responses from many different cell types combined at a population level.
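The net interaction profile that results from a strong, narrow excitatory component and a weaker, broader inhibitory component can be sketched as a difference of Gaussians. Treating the ranges of 0.6 and 1.0° as Gaussian width parameters is an assumption, as are the amplitudes `a_exc` and `a_inh`; the sketch illustrates the sign change of the net interaction, not the fitted interaction function.

```python
import numpy as np

R_EXC_DEG = 0.6  # excitatory range in degrees (from the text, read here as a Gaussian width)
R_INH_DEG = 1.0  # inhibitory range in degrees (from the text, read here as a Gaussian width)


def net_interaction(d, a_exc=1.0, a_inh=0.5, r_exc=R_EXC_DEG, r_inh=R_INH_DEG):
    """Net coupling at distance d (deg): excitation minus inhibition."""
    d = np.asarray(d, dtype=float)
    exc = a_exc * np.exp(-d ** 2 / (2.0 * r_exc ** 2))
    inh = a_inh * np.exp(-d ** 2 / (2.0 * r_inh ** 2))
    return exc - inh


def zero_crossing(a_exc=1.0, a_inh=0.5, r_exc=R_EXC_DEG, r_inh=R_INH_DEG):
    """Distance at which the net coupling changes from excitatory to inhibitory.

    Solves a_exc * exp(-d^2 / (2 r_exc^2)) = a_inh * exp(-d^2 / (2 r_inh^2)) for d;
    requires a_exc > a_inh and r_exc < r_inh.
    """
    return np.sqrt(2.0 * np.log(a_exc / a_inh) / (1.0 / r_exc ** 2 - 1.0 / r_inh ** 2))
```

With the illustrative amplitudes used here, the net coupling is positive at short distances and changes sign at somewhat less than 1°; the exact crossing point depends entirely on the assumed amplitude ratio.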
Relationship of our results to single cell analysis
Interaction profiles have been repeatedly examined at the level of single cells (Movshon et al., 1978; Heggelund, 1981a,b; Nelson, 1991; Tolhurst and Heeger, 1997). In those studies, the activity of a cell induced by a single stimulus at the RF center was compared to the activity of the cell in the presence of a second stimulus presented at varied locations.
In contrast, the population approach used here performs two different types of averages. First, because our stimuli were not RF-centered, we average across different spatial locations within the RFs (cf. Szulborski and Palmer, 1990). Outside the laboratory, visual objects are similarly distributed in arbitrary ways across RFs, so that this way of stimulus presentation and averaging is crucial for an understanding of how complex scenes are represented in visual cortex.
Second, we average across many different cell types. Neurons in area 17 contribute potentially to the representations of many different parameters such as retinal position, orientation, curvature, length, motion direction, etc. To characterize the contribution of each neuron to the representation of stimulus location, one might conceive of the high-dimensional space spanned by these different parameters. Each neuron could be thought of as a point in this parametric space. This point corresponds to a set of preferred values for all represented parameters. By asking only how the firing rate of the neuron depends on visual field position, the contributions of all neurons are averaged, although their preferred parameter set may be different along other dimensions. In this sense, the DPA is a projection from a potentially high-dimensional space onto a common neuronal space representing only visual field position. The DPA could thus be viewed as a neural population receptive field of the inverted cortical point-spread function (“cortical spread-point function”).
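The idea of projecting a heterogeneous population onto visual field position can be illustrated with a simple RF-weighted average. This sketch is a generic construction under stated assumptions (Gaussian RF profiles, one firing rate per neuron); it is not the Optimal Linear Estimator, and the function name `rf_weighted_dpa` is hypothetical.

```python
import numpy as np


def rf_weighted_dpa(x, rf_centers, rf_widths, rates):
    """Average of assumed Gaussian RF profiles, each weighted by its neuron's rate.

    Every neuron contributes along the position axis only, regardless of its
    preferred orientation, direction, or other parameters.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(rf_centers, dtype=float)[:, None]
    widths = np.asarray(rf_widths, dtype=float)[:, None]
    rates = np.asarray(rates, dtype=float)[:, None]
    profiles = np.exp(-(x[None, :] - centers) ** 2 / (2.0 * widths ** 2))
    return (rates * profiles).sum(axis=0) / rates.shape[0]
```

In this simplified picture, the most active neurons pull the distribution toward their RF centers, and the shape of the result reflects the whole ensemble rather than any single cell.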
The shape of the DPA matters
Population coding ideas have largely been centered on estimating the stimulus or task parameter from the activity of populations of neurons (Georgopoulos et al., 1986, 1993; Vogels, 1990; Zohary, 1992; Seung and Sompolinsky, 1993; Salinas and Abbott, 1994; Groh et al., 1997). Compared to vector-based population techniques, the current approach focused on the concept of an entire distribution of population activation (Lee et al., 1988; Bastian et al., 1997; Pouget et al., 1998) (for related attempts, see Anderson, 1994a,b; Zemel et al., 1998, in which they seek to recover a probability distribution of activity over the encoded variable). In our approach, the distribution is significant not only by a mean value of the represented parameter, but also through its shape. Consequently, asymmetric deformations of the DPA could be detected, in which two peaks in the DPA are repelled from each other at sufficiently large stimulus separations. This effect is observable only by taking the shape of the constructed DPA into account and would be detectable neither on the basis of PSTH responses of individual cells nor in reconstructions that estimate only single values or discrete samples of parameters.
Relationship of our results to cortical maps
In principle, our time-averaged two-dimensional DPAs are equivalent to activities recorded in functional imaging studies such as functional magnetic resonance imaging, positron emission tomography, and optical imaging of intrinsic signals, assuming a clean retinotopy. There are a number of differences, however. Besides the limitations of these techniques in resolving the millisecond time scale as accomplished by our single cell recordings, the main problem arises from the fact that the retinotopy is far from a clean representation of the visual field (cf. Das and Gilbert, 1997). This is particularly obvious at the spatial scale of our investigation, which differentiates between visual angles <1° apart (Hubel and Wiesel, 1962; Albus, 1975). Analysis of the cortical point-spread function has shown that the processing of even very small stimuli is associated with a widespread pattern of cortical activation (Grinvald et al., 1994; Godde et al., 1995; Chen-Bee and Frostig, 1996). In addition, imaging methods as listed above do not solely reflect spike activity but include contributions from glial cells and cerebral blood flow. Accordingly, comparison of DPAs spanned in parametric space with cortical activation maps recorded with such imaging techniques may allow separating neural and non-neural contributions.
Dynamically distributed processing over a large cortical area possibly plays a major role in neural strategies of cooperative interaction. Observations in real-time imaging studies supported this assumption because the firing of single neurons can be predicted if the whole pattern of cortical population activation is taken into account (Arieli et al., 1996; Kenet et al., 1998). Because our approach allows for a functional interpretation of cortical activation patterns, it may serve to find transformation rules that map the multidimensional visual input onto cortical representations.
This work was supported by grants from the Deutsche Forschungsgemeinschaft (Scho 336/4-2 to G.S. and Di 334/5-1,3 to H.D.). We thank Dr. Alexa Riehle and Annette Bastian for discussion, Dr. Christoph Schreiner for helpful comments on an earlier version of this manuscript, and David Kastrup for proofreading.
Correspondence should be addressed to Dr. Dirk Jancke, Institut für Neuroinformatik, Theoretische Biologie ND 04, Ruhr-Universität Bochum, D-44780 Bochum, Germany.