WWW.JNEUROSCI.ORG
The Journal of Neuroscience

Volume 17, Number 21, Issue of November 1, 1997 pp. 8621-8644
Copyright ©1997 Society for Neuroscience

Linearity and Normalization in Simple Cells of the Macaque Primary Visual Cortex

Matteo Carandini1, David J. Heeger2, and J. Anthony Movshon1

1 Howard Hughes Medical Institute and Center for Neural Science, New York University, New York, New York 10003, and 2 Department of Psychology, Stanford University, Stanford, California 94305


ABSTRACT

Simple cells in the primary visual cortex often appear to compute a weighted sum of the light intensity distribution of the visual stimuli that fall on their receptive fields. A linear model of these cells has the advantage of simplicity and captures a number of basic aspects of cell function. It, however, fails to account for important response nonlinearities, such as the decrease in response gain and latency observed at high contrasts and the effects of masking by stimuli that fail to elicit responses when presented alone. To account for these nonlinearities we have proposed a normalization model, which extends the linear model to include mutual shunting inhibition among a large number of cortical cells. Shunting inhibition is divisive, and its effect in the model is to normalize the linear responses by a measure of stimulus energy. To test this model we performed extracellular recordings of simple cells in the primary visual cortex of anesthetized macaques. We presented large stimulus sets consisting of (1) drifting gratings of various orientations and spatiotemporal frequencies; (2) plaids composed of two drifting gratings; and (3) gratings masked by full-screen spatiotemporal white noise. We derived expressions for the model predictions and fitted them to the physiological data. Our results support the normalization model, which accounts for both the linear and the nonlinear properties of the cells. An alternative model, in which the linear responses are subject to a compressive nonlinearity, did not perform nearly as well.

Key words: visual cortex; contrast; nonlinearity; gain control; normalization; masking; noise


INTRODUCTION

A longstanding view of simple cells in the primary visual cortex is that they compute a weighted sum of the light intensities falling on their receptive field (Hubel and Wiesel, 1962; Movshon et al., 1978a; Carandini et al., 1997b). This linear model is depicted in Figure 1A and is usually taken to include a rectification (thresholding) stage to account for the transformation of intracellular signals into firing rates.


Fig. 1. Two models of simple cell function. A, The linear model, composed of a linear stage (receptive field) and a rectification stage. The linear stage performs a weighted sum of the light intensities over local space and recent time. This sum is converted into a positive firing rate by the rectification stage. Rectification is a nonlinearity, so the "linear model" is not entirely linear. B, The normalization model extends the linear model by adding a divisive stage. The linear stage feeds into a circuit composed of a resistor and a capacitor in parallel (RC circuit). The conductance of the resistor grows with the pooled output of a large number of cortical cells. This effectively divides the output of the linear stage.


Although many aspects of simple cell responses are consistent with the linear model, there also are important violations of linearity. For example, scaling the contrast of a stimulus would identically scale the responses of a linear cell. At high contrasts, however, the responses of simple cells show clear saturation (Maffei and Fiorentini, 1973). Moreover, simple cells are subject to cross-orientation inhibition; the responses to an optimally oriented stimulus can be diminished by superimposing an orthogonal stimulus that is ineffective in driving the cell when presented alone (Morrone et al., 1982; Bonds, 1989; Bauman and Bonds, 1991).

According to a view that has emerged in recent years, the nonlinearities of simple cells could be explained by extending the linear model to include a gain control stage (Albrecht and Geisler, 1991; Heeger, 1991, 1992b, 1993; DeAngelis et al., 1992; Carandini and Heeger, 1994; Nestares and Heeger, 1997; Tolhurst and Heeger, 1997a,b). In particular, one of us (Heeger, 1991, 1992b) proposed a normalization model (Fig. 1B), in which the linear response of every cell is divided (or "normalized") by a number that grows with the activity of a large number of cortical cells, the normalization pool. The normalization model attributes the selectivity of a cell to the initial linear stage and its nonlinear behavior to the division stage. For example, the model predicts response saturation because the divisive suppression increases with stimulus contrast, and the model predicts cross-orientation inhibition because the normalization pool includes neurons with a wide variety of tuning properties, many of which respond to orthogonal gratings.

Previously, we have suggested a possible biophysical implementation of the normalization model (Fig. 1B) (Carandini and Heeger, 1994). The cell membrane is modeled as an RC circuit, composed of a resistor and a capacitor in parallel. The linear stage injects synaptic current into the cell, and normalization operates by controlling the conductance of the resistor, i.e., the membrane conductance. The cells in the normalization pool effectively inhibit each other by increasing the membrane conductance of each other. This shunting inhibition controls the gain of the transformation of input current to output membrane potential. A rectification stage converts the latter into a firing rate.

To test this model against large data sets obtained in monkey primary visual cortex, we recorded the responses of simple cells in area V1 of paralyzed, anesthetized macaques, while presenting a variety of visual stimuli. These stimuli included drifting gratings, plaids composed of two drifting gratings, and drifting gratings superimposed on full-screen spatiotemporal white noise. The gratings had a wide range of contrasts, temporal frequencies, spatial frequencies, and orientations. We derived equations for the model responses to such stimuli, and we found that these equations provided good fits to the neural responses.

Portions of this work have been presented briefly elsewhere (Carandini and Heeger, 1994, 1995).


MATERIALS AND METHODS

Experiments were performed on five cynomolgus macaque monkeys (Macaca fascicularis) and four pigtail macaque monkeys (M. nemestrina) ranging in weight from 1.5 to 4 kg.

Preparation and maintenance

Animals were initially anesthetized with ketamine HCl (10 mg/kg) and premedicated with atropine sulfate (0.05 mg/kg) and acepromazine maleate (0.1 mg/kg). Anesthesia continued on 1.5-2.0% halothane in a 98% O2-2% CO2 mixture while the initial surgery was performed. Indwelling catheters were introduced into the saphenous veins of each hindlimb, and a tracheotomy was performed.

The animal was then mounted in a stereotaxic instrument, and halothane anesthesia was replaced by a continuous infusion of sufentanil citrate (typically 4-6 µg·kg⁻¹·hr⁻¹, beginning with a loading dose of 4 µg/kg). EEG, ECG, and arterial blood pressure were monitored continuously, and any signs of arousal were corrected by modifying the rate of anesthetic infusion. The monkey was artificially respirated with a mixture of O2, N2O, and CO2 adjusted so that end-tidal CO2 was maintained at 3.8-4.0%. Rectal temperature was kept near 37°C with a heating pad.

A small craniotomy was performed, usually 9-10 mm lateral to the midline and 3-4 mm posterior to the lunate sulcus. This location often yielded two encounters with the primary visual cortex, with eccentricities first at ~2-5° and then at ~8-15°. A small slit in the dura was made, and a vertical hydraulic microdrive containing a glass-coated tungsten microelectrode (Merrill and Ainsworth, 1972) in a guide tube was positioned. The craniotomy was covered with a chamber containing 4% agar in sterile saline solution.

On completion of surgery, animals were paralyzed to minimize eye movements. Paralysis was maintained with an infusion of vecuronium bromide (Norcuron, 0.1 mg·kg⁻¹·hr⁻¹) in lactated Ringer's solution with dextrose (5.4 ml/hr). The pupils were dilated and accommodation paralyzed with topical atropine. The corneas were protected with zero power gas-permeable contact lenses; supplementary lenses were chosen to focus the eyes on a tangent screen plotting table set up at a distance of 57 in. To maintain the animal in good physiological condition during experiments (typically 72-96 hr), intravenous supplementation of 2.5% dextrose/lactated Ringer's was given at 5-15 ml/hr. Animals received daily injections of a broad-spectrum antibiotic (Bicillin) as well as an anti-inflammatory agent (dexamethasone) to prevent cerebral edema.

Stimuli

Stimuli were generated by a Truevision ATVista board operating at a resolution of 582 × 752 and a frame rate of 106 Hz, the output of which was directed to a Nanao T560i monitor (mean luminance, 72 cd/m², subtending 10-25° of visual angle). Nonlinearities in the relation between applied voltage and phosphor luminance were compensated by appropriate look-up tables. Stimulus strength is measured in units of contrast, defined as the difference between the highest and lowest intensities, divided by the sum of the two.

Drifting luminance-modulated sinusoidal gratings were presented alone or superimposed on another grating or on a noise background. Superposition was obtained by interleaving, i.e., by presenting the two components in alternate frames. When two gratings were presented together they had the same temporal frequency and differed in orientation and/or spatial frequency. Their contrast could be varied independently. The noise background was composed of square pixels, the size of which was chosen for each cell to be approximately one-fourth of the spatial period of the optimal grating. Occasionally we used one-dimensional noise (bars rather than squares). The intensity of each square was randomly refreshed at 13.4 or 26.8 Hz and assumed one of two possible values.

All the stimuli had the same mean luminance. The grating and plaid stimuli were vignetted by a square window, the size of which was chosen to elicit the maximal responses. The noise masks occupied the whole screen. In their absence the surrounding field was uniform.

Experiments. Experiments consisted of two to nine consecutive blocks of stimuli. Each block consisted of a random permutation of 5-90 stimuli. Randomization was adopted to minimize the effects of adaptation and other nonstationarities. The stimuli had equal duration (generally 5-10 sec) and were separated by uniform field presentations lasting about 4 sec.

Experimental protocol. Receptive fields were initially mapped by hand on a tangent screen. When the activity of a single neuron was isolated, we established the dominant eye of the neuron and occluded the other eye. We then positioned the receptive field on the face of the monitor, and quantitative experiments proceeded under computer control.

To characterize each cell we performed the following sequence of measurements using single gratings: (1) orientation and direction tuning; (2) spatial frequency tuning; (3) temporal frequency tuning; and (4) stimulus size tuning. Each of these measurements was performed at the optimal values of the parameters as obtained from the previous measurements. Cells were classified as simple or complex on the basis of the frequency component of their response to the drifting grating eliciting the maximum number of spikes, as classified by Skottun et al. (1991). If the cell was simple we proceeded to the core experiments in this study. These were of three types:

(1) Grating matrix experiments, consisting of drifting sinusoidal stimuli having 5-10 different contrasts, two to four different temporal frequencies, and two to four different orientations or spatial frequencies. A typical experiment would involve three orientations or spatial frequencies, three temporal frequencies, and five contrasts, yielding a total of 45 stimuli.

(2) Plaid experiments, consisting of sums of two gratings with contrasts that were independently varied. Often the two directions were opposite, and the "plaid" was a counterphase flickering grating. A typical experiment would involve two orthogonal gratings with contrasts that assumed five possible values, yielding a total of 25 different stimuli.

(3) Noise-masking experiments, in which the contrast response to drifting gratings was measured in the presence of noise at different contrasts. A typical experiment would involve nine grating contrasts and two noise contrasts (0 and 0.5), yielding a total of 18 different stimuli.

Data analysis

Amplified and bandpass-filtered signals from the microelectrode were fed into a hardware window discriminator. A computer interface (Cambridge Electronic Design 1401 Plus) collected the pulses triggered by each action potential and the synchronization signals from the video graphics board.

Response measure. Our measure of cell response is the first harmonic $r$ of the spike trains, a complex number indicating the amplitude and phase of the best-fitting sinusoid having the same temporal frequency as the stimulus. This number is obtained from the spike train by computing $r = (1/D) \sum_k [\cos(2\pi f t_k) + i \sin(2\pi f t_k)]$, where $D$ is the stimulus duration, $f$ is the temporal frequency of the stimulus, and the $t_k$ are the times of the individual spikes. The amplitude of the first harmonic has units of spikes per second. The responses $r$ obtained in an experiment constitute a matrix $\mathbf{r} = \{r_{s,b}\}$, where the subscripts indicate the $s$th stimulus presented in the $b$th stimulus block. We denote the mean across blocks of the responses as the vector $\bar{\mathbf{r}} = \{r_s\}$. For example, in an experiment in which three blocks of 25 different stimuli were run, the matrix $\mathbf{r}$ would contain 75 elements, and the vector $\bar{\mathbf{r}}$ would contain 25 elements.
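A minimal sketch of this computation in Python, assuming NumPy. The function name `first_harmonic` and the toy spike times are hypothetical and are not taken from the analysis software used in the study.

```python
import numpy as np

def first_harmonic(spike_times, f, D):
    """First harmonic r = (1/D) * sum_k exp(2*pi*i*f*t_k) of a spike train.

    spike_times: spike times t_k (s); f: stimulus temporal frequency (Hz);
    D: stimulus duration (s). Returns a complex number in spikes/sec.
    """
    t = np.asarray(spike_times, dtype=float)
    # exp(i*x) = cos(x) + i*sin(x), so this is the sum given in the text.
    return np.sum(np.exp(2j * np.pi * f * t)) / D

# Toy example: four spikes locked to the same phase of a 1 Hz stimulus lasting 4 s.
r = first_harmonic([0.0, 1.0, 2.0, 3.0], f=1.0, D=4.0)
amplitude, phase = np.abs(r), np.angle(r)
```

The amplitude and phase of the best-fitting sinusoid are the modulus and argument of the returned complex number.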

Correction for eye movements. Inspection of the spike rasters often revealed a few discrete misalignments across stimulus blocks in the responses to individual stimuli, which are best explained by the presence of small eye movements. For drifting grating stimuli the sole effect of these eye movements would be a shift in response timing. We reduced this effect by shifting in time all the responses in each block by an amount chosen to minimize $\sigma_s^2$, the variance across blocks of the responses. Because all the responses in a block are translated by the same amount, this method would completely remove the effect of the movements only if they occurred exactly between blocks. In all other cases it is just an approximation that reduces the variance of the data. No attempt was made to correct the effect of possible eye movements on the responses to plaids or to gratings in the presence of noise.

Estimation of the variance. The number of blocks in our experiments (two to nine) was not sufficient to obtain reliable estimates of the variance $\sigma_s^2$ of the responses to each stimulus $s$. For this reason we estimated the dependence of $\sigma_s^2$ on $|r_s|$, the amplitude of the mean responses. As a functional form for this dependence we chose the simple relation $\sigma_s^2 = \alpha |r_s|^\beta$, where $\alpha$ and $\beta$ are free parameters. This expression provided very good fits to the data. In the fits, the scale factor $\alpha$ was on average 2.11 ± 0.18, and the exponent $\beta$ was on average 1.18 ± 0.02, consistent with previous findings that the variance of the responses of V1 neurons is proportional to their mean (Dean, 1981b; Tolhurst et al., 1983; Bradley et al., 1987; Vogels et al., 1989).
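The text does not state how the two parameters were estimated. One simple possibility, shown here only as an assumption, is a straight-line fit in log-log coordinates, where the slope gives the exponent and the intercept gives the log of the scale factor. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def fit_variance_power_law(mean_amplitude, variance):
    """Fit variance = alpha * amplitude**beta by a straight line in log-log space."""
    x = np.log(np.asarray(mean_amplitude, dtype=float))
    y = np.log(np.asarray(variance, dtype=float))
    beta, log_alpha = np.polyfit(x, y, 1)   # slope, intercept
    return np.exp(log_alpha), beta

# Synthetic (not measured) data generated exactly from alpha = 2, beta = 1.2.
amp = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
alpha, beta = fit_variance_power_law(amp, 2.0 * amp**1.2)
```

A log-log fit requires strictly positive amplitudes and variances, so stimuli with zero mean response would have to be excluded or handled separately.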

Model fits. The models discussed in Results were fit to the responses to all stimuli in an experiment. Different experiments were fitted independently and thus yielded different sets of parameters. To fit the predictions of a model $\mathbf{m} = \{m_s\}$ to the data we performed a weighted least squares fit; i.e., we searched for the parameters $\mathbf{a}$ that minimized the error function
$$\mathrm{Error}(\mathbf{a}) = \sum_s |m_s(\mathbf{a}) - r_s|^2 / \sigma_s^2,$$
where the $\sigma_s^2$ are the estimated variances. To avoid giving too much importance to data points of low amplitude, when fitting the models of the visual responses we took all the $\sigma_s^2 < 1$ to be equal to 1.
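The error function, including the floor on small variances, can be sketched as follows. The function name `weighted_error` is hypothetical, and the search over model parameters is not shown because the text does not specify the optimizer.

```python
import numpy as np

def weighted_error(model, mean_resp, variance, floor=1.0):
    """Sum over stimuli of |m_s - r_s|^2 / sigma_s^2; variances below `floor` are raised to it."""
    model = np.asarray(model, dtype=complex)
    mean_resp = np.asarray(mean_resp, dtype=complex)
    sigma2 = np.maximum(np.asarray(variance, dtype=float), floor)
    return float(np.sum(np.abs(model - mean_resp) ** 2 / sigma2))

# Toy example with two stimuli; the first variance (0.25) is floored to 1.
err = weighted_error([1 + 1j, 2], [1, 0], [0.25, 4.0])   # 1/1 + 4/4 = 2.0
```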

Percentage of the variance. To gain an intuitive assessment of the quality of the fits provided by a model, we computed the percentage of the variance across stimuli for which the model accounted. To define this measure it is useful to consider the (mean square) distance between two sets of responses $\mathbf{x} = \{x_s\}$ and $\mathbf{y} = \{y_s\}$:
$$d(\mathbf{x}, \mathbf{y}) = \frac{1}{N} \sum_s |x_s - y_s|^2,$$
where the sum is over the stimuli $s$, and $N$ is the number of stimuli. The percentage of the variance accounted for by the model may then be expressed as:
$$\%\mathrm{variance} = 100 \cdot [1 - d(\mathbf{m}, \bar{\mathbf{r}}) / d(\bar{\mathbf{r}}, \bar{\bar{\mathbf{r}}})],$$
where $\bar{\bar{\mathbf{r}}}$ is the response mean computed across stimuli and across blocks. In this expression, the numerator is the distance between the model predictions and the mean cell responses; the denominator is the variance across stimuli of the mean cell responses. For example, if the model predicts the mean responses exactly, then it accounts for 100% of the variance. More realistically, if the mean square error between the model predictions and the responses is $d(\mathbf{m}, \bar{\mathbf{r}}) = 10$ (spikes/sec)$^2$, and the responses in the data set have very different amplitudes and/or phases, so that their variance is large, say $d(\bar{\mathbf{r}}, \bar{\bar{\mathbf{r}}}) = 100$ (spikes/sec)$^2$, then the model accounts for 90% of the variance in the data.
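A short sketch of this measure, with hypothetical function names. It assumes every stimulus was shown in the same number of blocks, so that the grand mean equals the mean across stimuli of the block-averaged responses.

```python
import numpy as np

def mean_sq_distance(x, y):
    """d(x, y) = (1/N) * sum_s |x_s - y_s|^2 for (possibly complex) response vectors."""
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(y)) ** 2))

def percent_variance(model, mean_resp):
    """100 * [1 - d(model, mean responses) / d(mean responses, grand mean)]."""
    mean_resp = np.asarray(mean_resp)
    grand_mean = mean_resp.mean()           # mean across stimuli of the block means
    return 100.0 * (1.0 - mean_sq_distance(model, mean_resp)
                    / mean_sq_distance(mean_resp, grand_mean))

r_bar = np.array([0.0, 10.0, 20.0])                        # toy mean responses
perfect = percent_variance(r_bar, r_bar)                   # exact prediction
flat = percent_variance(np.full(3, r_bar.mean()), r_bar)   # model predicts only the grand mean
```

A model that predicts the mean responses exactly scores 100, and a model that predicts the grand mean for every stimulus scores 0.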

Bootstrap test. Although the percentage of the variance is an intuitive measure of the quality of the fits, it has the disadvantage of taking into account only the variability across stimuli and not the variability across blocks. If a cell were very noisy, our experiments would yield bad estimates of its mean responses $r_s$; in this case the model would account for a small percentage of the variance in the data even if it reflected the exact physical reality underlying the responses. To test the quality of the model predictions taking into account all the statistical properties of the data, we performed a bootstrap hypothesis test (Efron and Tibshirani, 1991). The advantage of bootstrapping is that it does not assume that the response variability follows a particular (e.g., Gaussian) distribution.

We tested whether we could reject the null hypothesis that the mean of the probability distribution underlying the neural responses was identical to the predictions of the model. Let $\mathbf{r}_b$ be the vector of responses obtained in the $b$th block of stimuli. If for example an experiment involved 25 different stimuli and was repeated four times, there would be four vectors of responses, $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_3$, and $\mathbf{r}_4$, and each would contain 25 elements. Let $\mathbf{m}$ be the prediction of the model obtained by fitting all the $\mathbf{r}_b$. The null hypothesis states that the mean $\mu_r$ of the probability distribution from which the $\mathbf{r}_b$ are drawn is identical to the prediction of the model:
$$H_0: \mu_r = \mathbf{m}.$$

As a test statistic we chose the distance between the model predictions and the empirical average of the responses:
$$t = d(\mathbf{m}, \bar{\mathbf{r}}).$$
Having observed a value $t^{\mathrm{obs}}$ by evaluating the test statistic on the actual experimental data, we calculated the probability of observing at least that large a value if the null hypothesis were true. This probability is the achieved significance level (ASL) of the test:
$$\mathrm{ASL} = \mathrm{Prob}\{t \geq t^{\mathrm{obs}} \mid H_0\}.$$
The smaller the ASL, the stronger the evidence against $H_0$.

To compute the ASL with the bootstrap method, we converted our data set $\mathbf{r}$ into one with an empirical distribution function that obeyed $H_0$. This was simply done by shifting the data so that the mean responses were exactly equal to the model predictions, $\tilde{\mathbf{r}} = \mathbf{r} - \bar{\mathbf{r}} + \mathbf{m}$ (Efron and Tibshirani, 1993). We then computed the bootstrap estimate of the ASL by repeating the following steps 1000 times: (1) Draw a sample data set $\mathbf{r}^*$ with replacement from $\tilde{\mathbf{r}}$. For example, if the experiment was repeated four times, a possible draw would be $\mathbf{r}^* = \{\tilde{\mathbf{r}}_4, \tilde{\mathbf{r}}_1, \tilde{\mathbf{r}}_2, \tilde{\mathbf{r}}_2\}$; another one could be $\mathbf{r}^* = \{\tilde{\mathbf{r}}_2, \tilde{\mathbf{r}}_1, \tilde{\mathbf{r}}_2, \tilde{\mathbf{r}}_3\}$, and so on. (2) Compute the test statistic on the sample, $t^* = d(\mathbf{m}, \bar{\mathbf{r}}^*)$.

The bootstrap estimate of the achieved significance level of the test is equal to the percentage of samples for which the $t^*$ values are larger than the observed value $t^{\mathrm{obs}}$.
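The procedure can be sketched in Python as follows. The function names, the toy data, and the fixed random seed are assumptions made for illustration. The sketch resamples whole blocks and counts samples with a statistic at least as large as the observed one, following the definition of the ASL given above.

```python
import numpy as np

def mean_sq_distance(x, y):
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(y)) ** 2))

def bootstrap_asl(r, m, n_boot=1000, seed=0):
    """Bootstrap estimate of ASL = Prob{t >= t_obs | H0}.

    r: responses, shape (n_blocks, n_stimuli); m: model predictions, shape (n_stimuli,).
    """
    r, m = np.asarray(r), np.asarray(m)
    rng = np.random.default_rng(seed)
    r_bar = r.mean(axis=0)
    t_obs = mean_sq_distance(m, r_bar)
    r_tilde = r - r_bar + m          # shifted data whose block mean equals m (obeys H0)
    n_blocks = r.shape[0]
    hits = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n_blocks, size=n_blocks)   # draw blocks with replacement
        t_star = mean_sq_distance(m, r_tilde[idx].mean(axis=0))
        hits += t_star >= t_obs
    return hits / n_boot

r = np.array([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [0.0, 1.0, 2.0]])  # toy data, 3 blocks
asl_good = bootstrap_asl(r, r.mean(axis=0))          # model equals the mean responses
asl_bad = bootstrap_asl(r, r.mean(axis=0) + 100.0)   # model far from the data
```

In the first toy case the observed statistic is zero, so the null hypothesis cannot be rejected; in the second the model misses the data by far more than the block-to-block variability, so no bootstrap sample reaches the observed statistic.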


MODEL

The normalization model is depicted in Figure 1B. To keep the model mathematically tractable, we adopt a number of simplifications. To begin, we define the driving current of a simple cell to be the current that would be measured by clamping the voltage of the cell at rest. Then we assume that (1) the relation between the visual stimuli and the driving current is linear; (2) the cell membrane is a single passive compartment; (3) the firing rate is a rectified copy of the membrane potential; (4) cells inhibit each other (possibly through inhibitory interneurons) by increasing the membrane conductance of each other; and (5) the pool of cells that inhibit each other contains cells tuned to a wide variety of stimulus attributes.

The linear stage. As a visual stimulus is projected on the retina it can be described by its light distribution, $l(x,y,t)$, which varies in the two spatial dimensions $x,y$ and in time $t$. This representation ignores the color of the stimulus and assumes monocular viewing. The light distributions of the stimuli used in this study modulated about a fixed mean $\bar{l}$. In these conditions the output of the retina is to a first approximation proportional to the local contrast, $c(x,y,t) = [l(x,y,t) - \bar{l}]/\bar{l}$ (Shapley and Enroth-Cugell, 1984). We will use the term contrast and the symbol $c$ (without arguments) to denote the maximal value of the local contrast $c(x,y,t)$. A uniform field has zero contrast, whereas a grating modulating between zero and twice its mean intensity has unit contrast.
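As a small numerical check, the following sketch builds one spatial period of a sinusoidal grating and confirms that the peak local contrast defined here matches the contrast definition given in Materials and Methods (difference between highest and lowest intensities divided by their sum). The variable names and the contrast value of 0.5 are illustrative assumptions.

```python
import numpy as np

mean_lum = 72.0                      # mean luminance (cd/m^2), the monitor value from Methods
c = 0.5                              # nominal grating contrast (illustrative value)
x = np.linspace(0.0, 1.0, 1001)      # one spatial period, arbitrary units
lum = mean_lum * (1.0 + c * np.sin(2 * np.pi * x))   # luminance profile l(x)

local_contrast = (lum - mean_lum) / mean_lum          # c(x) = [l(x) - mean] / mean
peak_contrast = local_contrast.max()                  # "contrast" as used in the model
methods_contrast = (lum.max() - lum.min()) / (lum.max() + lum.min())
```

For a grating modulating symmetrically about the mean, the two definitions agree.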

We consider the driving current in simple cells to be linearly related to the output of the retina and thus to the local contrast. The driving current $I_d(t)$ is obtained by weighting the local stimulus contrast $c(x,y,t)$ at each location and time by the value of the receptive field $W$ of the cell at that location and at that time, and by algebraically summing the results:
$$I_d(t) = \iiint W(x,y,T)\, c(x,y,t-T)\, dx\, dy\, dT. \qquad (1)$$
This linear equation is at best an approximation. Possible biophysical conditions that would lead to it being exact were suggested in a previous study (Carandini and Heeger, 1994), and are summarized in Discussion.
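A discrete version of Eq. 1 can be sketched as below. The array layout, the unit sample spacing, and the convention that contrast is zero before the first sample are assumptions made for illustration; the function name is hypothetical.

```python
import numpy as np

def driving_current(W, c):
    """Discrete version of Eq. 1 with unit sample spacing (dx = dy = dT = 1).

    W: receptive field weights, shape (nx, ny, nT); the last axis is the time lag T.
    c: local stimulus contrast, shape (nx, ny, nt); the last axis is time t.
    Contrast before the first sample is taken to be zero.
    """
    nx, ny, nT = W.shape
    nt = c.shape[2]
    I_d = np.zeros(nt)
    for t in range(nt):
        for T in range(min(nT, t + 1)):
            # Weight the contrast T samples in the past by W at lag T, then sum over space.
            I_d[t] += np.sum(W[:, :, T] * c[:, :, t - T])
    return I_d

W = np.ones((1, 1, 2))                 # toy receptive field: one pixel, two time lags
c = np.array([[[1.0, 2.0, 3.0]]])      # toy contrast sequence
I_d = driving_current(W, c)            # [1., 3., 5.]
```

Because the operation is a weighted sum, scaling the stimulus contrast scales the driving current by the same factor, which is the linearity property the text relies on.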

In this study, the driving current $I_d$ (and thus the receptive field $W$) will be estimated rather than measured directly. Direct measurement of $I_d$ would require intracellular in vivo voltage-clamp experiments.

RC circuit. We adopt an extremely simplified biophysical model of a cell membrane: a circuit composed of a resistor and a capacitor arranged in parallel (RC circuit). According to this model, the membrane potential $V(t)$ obeys the following equation:
$$C \frac{dV}{dt} + gV = I_d, \qquad (2)$$
where $C$ is the membrane capacitance, $g(t)$ is the total membrane conductance, and $I_d(t)$ is the driving current. In the absence of visual stimuli the driving current is zero, and the membrane potential is driven to its resting value, which we have taken to be zero.

Rectification. As a first approximation, the transformation from the membrane potential $V$ to the spike rate $R$ can be modeled by rectification (Movshon et al., 1978b; Jagadeesh et al., 1992; Carandini et al., 1996). Rectification is a function that is zero for membrane potentials below a threshold, $V_{\mathrm{thresh}}$, and grows linearly above it: $R(t) \propto \max(0, V(t) - V_{\mathrm{thresh}})$. This function is depicted for three different values of the threshold $V_{\mathrm{thresh}}$ by the straight lines in Figure 2A.


Fig. 2. Interrelations and effects of the principal variables in the normalization model. A, Relation between membrane potential $V$ and firing rate $R$. For simplicity in this study the resting potential is taken to be $V = 0$. The thick, intermediate, and thin lines depict rectification with thresholds $V_{\mathrm{thresh}}$ = 0, 6, and 12 mV, respectively. The dashed curves indicate approximations to rectification obtained with power functions, with exponents $n = 2$ (thick dashes) and $n = 3$ (thin dashes). B, Relation between pool activity and membrane conductance. The abscissa plots the overall response of the pool, $k \sum R$; the ordinate plots the increase in membrane conductance $g/g_0 - 1$ (Eq. 4). C, Effects of conductance on the size and time course of the membrane potential responses. The curves are the membrane potential responses to a current step with onset at time zero, for three different values of the conductance $g$. As the conductance doubles (thin to thick curves), it reduces both the gain and the time constant of the cell.


Rectification, however, is not easily handled in mathematical derivations. We thus approximate rectification ($V_{\mathrm{thresh}} > 0$) with half-rectification ($V_{\mathrm{thresh}} = 0$) followed by elevation to the power $n$:
$$R \propto \max(0, V)^n. \qquad (3)$$
The quality of this approximation is shown by the dashed curves in Figure 2A. The value of the exponent $n$ grows with the distance of the threshold $V_{\mathrm{thresh}}$ from the resting potential $V_{\mathrm{rest}}$. If the threshold is very close to rest, then $n \approx 1$ ("half-rectification"). If the threshold is a bit above rest, e.g., 6 mV higher, then $n \approx 2$ ("half-squaring"). If the threshold is far above rest, then $n \approx 3$ or more.
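The two output nonlinearities can be written compactly as below. Both are defined only up to a scale factor, which is set to 1 here as an assumption; the function names and test voltages are hypothetical.

```python
import numpy as np

def rectify(V, V_thresh):
    """Rectification: zero below threshold, linear above it (unit slope assumed)."""
    return np.maximum(0.0, np.asarray(V, dtype=float) - V_thresh)

def half_power(V, n):
    """Eq. 3: half-rectification at rest (V = 0) followed by elevation to the power n."""
    return np.maximum(0.0, np.asarray(V, dtype=float)) ** n

V = np.array([-5.0, 0.0, 6.0, 12.0])    # membrane potential relative to rest (mV)
R_rect = rectify(V, V_thresh=6.0)       # [0., 0., 0., 6.]
R_sq = half_power(V, n=2)               # [0., 0., 36., 144.]
```

Comparing the two curves over the operating range of the cell, after matching their scale factors, gives a sense of how well a given exponent stands in for a given threshold.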

Conductance and cortical activity. We now make the central assumption that cells belong to a normalization pool, the members of which inhibit each other by increasing the conductance $g$ of each other. This form of inhibition is known as shunting inhibition and, unless all the neurons in the pool are inhibitory, would require the presence of inhibitory interneurons.

The particular function that we choose to relate the conductance $g$ and the overall activity of the pool $\sum R$ is illustrated in Figure 2B. Its mathematical expression is
$$g = \frac{g_0}{\sqrt{1 - k \sum R}}, \qquad (4)$$
where the parameter $k$ determines the effectiveness of the normalization pool. This function is completely ad hoc and is not currently supported by physiological evidence. Our reasons for choosing it are evident in Appendix, in which we derive closed form equations for the responses of the model.
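To illustrate why this functional form is convenient, consider a simplified steady-state case. The following is a sketch under assumptions introduced here for illustration, not a reproduction of the Appendix: constant driving currents $I_i \ge 0$ for the cells in the pool, a common conductance $g$, exponent $n = 2$, and a unit proportionality constant in Eq. 3, so that $R_i = V_i^2$.

```latex
\begin{align*}
V_i &= I_i / g && \text{(Eq. 2 with } dV/dt = 0\text{)} \\
\sum_i R_i &= \sum_i V_i^2 = \frac{1}{g^2} \sum_i I_i^2 && \text{(Eq. 3 with } n = 2\text{)} \\
g^2 \Bigl( 1 - \frac{k}{g^2} \sum_i I_i^2 \Bigr) &= g_0^2 && \text{(Eq. 4, squared and rearranged)} \\
g^2 &= g_0^2 + k \sum_i I_i^2 \\
V_i &= \frac{I_i}{\sqrt{g_0^2 + k \sum_j I_j^2}}
\end{align*}
```

Under these assumptions the square root in Eq. 4 cancels against the squaring in Eq. 3, and each response equals the linear drive divided by a quantity that grows with the pooled squared drive, that is, with a measure of stimulus energy.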

The membrane conductance g affects both the size and the time course of the responses. Figure 2C shows the responses of the membrane to a current step for three values of the conductance g. If the conductance is very small, the response is slow, and there is high gain (that is, the voltage response to a given current is high). If the conductance g is very large (the membrane is very leaky), it has small gain and is fast in charging and discharging the capacitor.
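For a constant conductance and a current step of size $I$ at time zero, Eq. 2 has the standard solution $V(t) = (I/g)(1 - e^{-t g / C})$, so the gain is $1/g$ and the time constant is $C/g$. The sketch below evaluates this solution for two conductances; the function name and all parameter values are illustrative assumptions, not values from the study.

```python
import numpy as np

def step_response(t, I, g, C):
    """Solution of C dV/dt + g V = I for a current step at t = 0, with V(0) = 0."""
    tau = C / g                          # membrane time constant
    return (I / g) * (1.0 - np.exp(-t / tau))

t = np.linspace(0.0, 0.2, 201)           # seconds
I, C = 1.0, 0.02                         # arbitrary units
V_low = step_response(t, I, g=1.0, C=C)  # gain 1/g = 1.0, tau = 20 msec
V_high = step_response(t, I, g=2.0, C=C) # gain 1/g = 0.5, tau = 10 msec
```

Doubling the conductance halves both the steady-state response and the time constant, as in Figure 2C.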

The conductance of each cell is minimal in the absence of any visual stimulus, because all of the cells in the normalization pool are silent. The conductances are larger for a visual stimulus that is effective in driving the cells in the pool. This decreases the gain and the time constant of the cells in the pool so that they are less responsive but better able to follow the fine temporal changes of the stimulus.

The normalization pool. Our final assumption regards the composition of the normalization pool. We assume that the cells in the pool are tuned to all stimulus orientations and directions and to a broad range of spatial and temporal frequencies.

Solution of the model. The variables in the model depend on each other in a circular way: (1) the firing rate $R$ of each cell depends on its membrane potential $V$ (Eq. 3; Fig. 2A); (2) the membrane potential $V$ of each cell depends on its driving current $I_d$ and on its conductance $g$ (Eq. 2); and (3) the conductance $g$ of each cell depends on $\sum R$, the total firing rate of the cells in the normalization pool (Eq. 4; Fig. 2B). This arrangement results in negative feedback, because increases in the overall response $\sum R$ increase the conductance $g$, which in turn reduces the overall response $\sum R$. This guarantees that the conductance $g$ remains finite ($\sum R < 1/k$ in Eq. 4).
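The circular dependence can be explored numerically. The sketch below integrates Eqs. 2-4 with a forward-Euler scheme for a small pool receiving constant driving currents, taking Eq. 4 as $g = g_0/\sqrt{1 - k \sum R}$. The integration scheme, the unit proportionality constant in Eq. 3, the shared conductance, and all parameter values are assumptions chosen for illustration; they are small enough that $k \sum R$ stays below 1 throughout.

```python
import numpy as np

def simulate_pool(I, g0=1.0, C=1.0, k=1.0, n=2, dt=0.01, steps=5000):
    """Forward-Euler integration of Eqs. 2-4 for a pool with constant driving currents I."""
    I = np.asarray(I, dtype=float)
    V = np.zeros_like(I)
    for _ in range(steps):
        R = np.maximum(0.0, V) ** n              # Eq. 3 (unit proportionality constant)
        g = g0 / np.sqrt(1.0 - k * R.sum())      # Eq. 4: conductance set by pool activity
        V = V + dt * (I - g * V) / C             # Eq. 2: C dV/dt = I_d - g V
    R = np.maximum(0.0, V) ** n
    g = g0 / np.sqrt(1.0 - k * R.sum())
    return V, R, g

V, R, g = simulate_pool([0.5, 0.3])              # two cells, arbitrary units
```

At the end of the run the membrane potentials satisfy $gV = I_d$, the pooled response remains below $1/k$, and the conductance has settled above its resting value $g_0$, which is the self-consistent state that the negative feedback produces.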

The model is a nonlinear neural network (Grossberg, 1988) and is in general quite complicated, because both the driving current and the conductance vary over time. Nevertheless, the model was designed so that for the visual stimuli used in this study (drifting sine gratings, plaids, and noise) we can derive approximate closed form equations for its responses. These equations, together with their derivation, are detailed in Appendix.


RESULTS

We report here on 149 data sets obtained from a total of 54 cells that were clearly identified as simple and were held long enough to be tested with at least two blocks of one of the core experiments in our protocol. In particular, we report on 51 grating matrix experiments from 34 cells, 76 plaid experiments from 27 cells, and 22 noise-masking experiments from 17 cells.

The cells in the sample exhibited a broad spectrum of tuning properties. The orientation tuning of the cells ranged from 14° to 124° half-width, with one-third of the cells showing a tuning sharper than 24° and one-third broader than 51°. The directional index of the cells (DI; Reid et al., 1987) ranged over the whole spectrum from 0 to 1. Direction selectivity was prominent (DI > 0.6) in about one-third of the cells.

Responses to gratings

Figure 3A shows the period histograms of the responses of a typical simple cell to drifting sinusoidal gratings with four different stimulus contrasts. Consistent with the linear model, the responses look like rectified sinusoids.
Fig. 3. Responses to drifting sine gratings of different contrasts. The curves are fits of the normalization model. The fits were performed on a larger data set, which included the responses to 72 different drifting gratings (8 contrasts, 3 orientations, and 3 temporal frequencies). A, Period histograms of the responses to four different contrasts. Scale bar in spikes per second. B, C, Response amplitude and phase as a function of contrast, computed from the first harmonic of the spike trains. D, Polar plot of the responses in B and C. Every point in the plot corresponds to a sinusoid with an amplitude that is given by the distance from the origin, and the phase of which is given by the angle with the horizontal axis. As the contrast increases the responses get larger (far from the origin), and their phases advance (they turn counterclockwise). Asterisks indicate the predictions of the normalization model at the different stimulus contrasts. Circles have radius 1 SEM (N = 3) computed from the estimated variance. Error bars in B and C are ±1 SEM, computed from circles in D. Cell 392l008 [directional index (DI) = 0.1; preferred spatial frequency (SF) = 0.9 cycles/°, stimulus size (SZ) = 4.5°], experiment 4. Parameters: τ0 = 37 msec; τ1 = 9 msec; n = 1.34.


Dependence on contrast. There are subtle aspects of the responses that are not consistent with a strictly linear model. One is response saturation (Maffei and Fiorentini, 1973; Dean, 1981a; Albrecht and Hamilton, 1982; Ohzawa et al., 1982; Li and Creutzfeldt, 1984; Sclar et al., 1990; Bonds, 1991; Carandini and Heeger, 1994). For a linear neuron, scaling stimulus contrast by a certain amount would scale the responses by the same amount. The responses of the cell in Figure 3, instead, increase only marginally as the contrast doubles from 0.5 to 1. Another nonlinearity is reflected in the latency of the responses. For a linear cell, response latency would be unaffected by stimulus contrast. Simple cells, instead, display phase advance (Dean and Tolhurst, 1986; Carandini and Heeger, 1994; Albrecht, 1995); i.e., they respond sooner to high-contrast stimuli than to low-contrast stimuli. For example, the cell in Figure 3 responds ~20 msec sooner to the stimulus with unit contrast than to the stimulus with 0.12 contrast.

These effects on response size and latency are reflected in the amplitude and phase of the first harmonic of the responses (Fig. 3B,C). For contrasts <0.2 the amplitudes (Fig. 3B) grow roughly linearly with contrast (the slope in double logarithmic coordinates is close to 1), and the phases (Fig. 3C) stay substantially constant. As the contrast increases, the amplitudes saturate and the phases advance.

Figure 3D replots the data in the polar plane where response amplitude is represented as distance from the origin, and response phase is represented as the angle with the horizontal axis. As the contrast increases the data points get farther from the origin (response amplitude increases), and they turn counterclockwise (response phase advances).

The predictions of the normalization model are characterized by two equations, one for response amplitude and one for response phase. The best fit model parameters were determined by simultaneously fitting both the amplitude and phase of the responses. The model captures the saturation in response amplitude (Fig. 3B) because it postulates that increasing contrast increases the activity of the normalization pool, which increases the membrane conductance, and thus decreases the gain of the membrane. The model captures the advance in response phase, because the increase in membrane conductance decreases the time constant, so at high contrasts the membrane introduces shorter delays than at low contrasts. The fits provided by the normalization model are substantially more accurate than those provided by the linear model; according to the linear model the data in Figure 3B should lie on a diagonal line (no amplitude saturation), and the data in Figure 3C should lie on a horizontal line (no phase advance).

The equations for response amplitude and phase predicted by the model are derived in Appendix. We present here the equation for response amplitude, because it helps further illustrate the behavior of the model. According to the model, the amplitude of the responses R of a simple cell to a grating of contrast c and temporal frequency f is:
$$\mathrm{amplitude}(R) = \left[\mathrm{amplitude}(L)\,\frac{c}{\sqrt{\sigma(f)^2 + c^2}}\right]^{n}, \qquad (5)$$
where the quantities L, σ(f), and n are determined, respectively, by the linear, normalization, and rectification stages of the model (Fig. 1B). L is the response of the linear receptive field of the cell to the grating at unit contrast (Eq. 1). The normalization stage divides this quantity by √(σ(f)² + c²), where σ(f) grows with the temporal frequency f of the stimuli. Finally, n is the exponent of the rectification stage (Eq. 3; Fig. 2A).

The dependence of response amplitude on stimulus contrast is quite simple; at low contrasts, c ≪ σ(f), the denominator is approximately constant, and the responses grow as c^n. At high contrasts, instead, the c in the denominator has a strong effect, and the responses saturate. Equation 5 is similar to a hyperbolic ratio, which was empirically found to provide good fits to the amplitude of the contrast responses of V1 cells (Albrecht and Hamilton, 1982; Sclar et al., 1990). Indeed, our ad hoc choice of the dependence of conductance on the activity of the normalization pool (Eq. 4) was made with this expression in mind.
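The two regimes of Equation 5 can be checked numerically. The following is a minimal sketch, not the fitting code used in the study; the function names and the parameter values (lin_amp, sigma, n) are illustrative assumptions rather than fitted quantities, and σ(f) is treated as a fixed number for a single temporal frequency.

```python
import math


def grating_amplitude(c, lin_amp, sigma, n):
    """Response amplitude of Eq. 5 for a grating of contrast c.

    lin_amp stands for amplitude(L), sigma for sigma(f) at one fixed
    temporal frequency, and n for the rectification exponent.
    """
    return (lin_amp * c / math.sqrt(sigma ** 2 + c ** 2)) ** n


def loglog_slope(c_lo, c_hi, lin_amp, sigma, n):
    """Slope of the contrast response between two contrasts on log-log axes."""
    r_lo = grating_amplitude(c_lo, lin_amp, sigma, n)
    r_hi = grating_amplitude(c_hi, lin_amp, sigma, n)
    return math.log(r_hi / r_lo) / math.log(c_hi / c_lo)


# Hypothetical parameter values chosen only for illustration.
LIN_AMP, SIGMA, N = 50.0, 0.2, 1.5

low_slope = loglog_slope(0.001, 0.002, LIN_AMP, SIGMA, N)   # c << sigma
high_slope = loglog_slope(0.5, 1.0, LIN_AMP, SIGMA, N)      # c > sigma

print(f"slope at low contrast:  {low_slope:.3f} (approaches n = {N})")
print(f"slope at high contrast: {high_slope:.3f} (saturation)")
```

With these assumed values the low-contrast slope is essentially n, whereas doubling the contrast from 0.5 to 1 changes the response only slightly, which is the saturation described above.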

Different orientations. Figure 4 shows the contrast responses of a simple cell to two drifting gratings differing in their orientation. As shown in Figure 4A, the responses elicited by the grating drifting at -15° (left column) were ~40% larger than those elicited by the grating drifting at -45°. This proportion remained substantially constant in the face of prominent saturation above a contrast of 0.25.
Fig. 4. Responses to drifting sine gratings at two different orientations, -15° (gray) and -45° (white). Fits of the normalization model (curves) were performed on a larger data set than shown, which included 72 stimulus conditions (8 contrasts, 3 orientations, and 3 temporal frequencies). A, Period histograms. Rows correspond to different contrasts, columns to different orientations. B, C, Response amplitude and phase as a function of contrast. To facilitate comparison in C the responses to each grating were shifted vertically so that the values predicted by the model would overlap. D, Polar plot of the responses in B and C. Cell 392l009 (DI = 0.5; SF = 0.4; SZ = 2.2), experiment 8; N = 3. Parameters: τ0 = 28 msec; τ1 = 3 msec; n = 1.6.


This property can be observed more precisely in Fig. 4B. The contrast responses obtained at the two different orientations are vertical shifts of each other on a logarithmic response scale, implying that the ratio of the responses to different orientations was constant, irrespective of the stimulus contrast. Another way to describe this behavior is to say that the orientation tuning scaled with contrast, a property that has been repeatedly observed for both orientation tuning and spatial frequency tuning (Movshon et al., 1978c; Albrecht and Hamilton, 1982; Sclar and Freeman, 1982; Li and Creutzfeldt, 1984; Skottun et al., 1987).

As with response saturation, phase advance was controlled by the contrast of the stimulus per se, rather than by the firing rate of the cell. Even though the absolute phases of the responses to the two gratings differed by about 180° (Fig. 4D) the relative timing of the responses (difference in response phase) was independent of stimulus contrast. This is illustrated in Fig. 4C, where the phases of the responses to each grating were shifted vertically so that the fits provided by the normalization model would overlap.

The curves predicted by the normalization model provided good fits to the data in Figure 4. Because saturation and phase advance depend on the stimulus contrast, and not on the size of the responses elicited in a cell, their presence is not simply the result of nonlinearities in the spike-encoding mechanism or in other attributes of a single cell. Rather, their presence indicates the existence of a contrast gain control mechanism in the visual cortex such as that described by the normalization model.

In fact, the model mandates the orientation invariances in the contrast responses, both in amplitude and in phase. In the expression for the response amplitude (Eq. 5), stimulus contrast and stimulus orientation are separable. The expression can be seen as the product of two factors, [amplitude(L)]^n and (c/√(σ(f)² + c²))^n. The first factor depends on L, the response of the linear receptive field of the cell to the grating at unit contrast, so it depends on orientation but not on contrast. The second factor depends only on the contrast c and on the temporal frequency f of the grating. For a fixed temporal frequency the shape of the contrast responses is entirely controlled by this second factor, which is independent of stimulus orientation. A similar argument can be made for the phase responses predicted by the model: the expression for response phase (Appendix, Eq. 13) is the sum of two terms, one that depends on stimulus orientation but not on contrast, and one that depends on stimulus contrast but not on orientation.
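The separability argument can be illustrated with a short numerical check. This is a sketch under stated assumptions: the two linear amplitudes (standing in for two orientations), sigma, and n are hypothetical values, and the function name is ours. It shows only that the ratio of the Equation 5 amplitudes at two orientations does not depend on contrast, which is what a vertical shift on a logarithmic response axis means.

```python
import math


def grating_amplitude(c, lin_amp, sigma, n):
    """Eq. 5: product of an orientation factor and a contrast factor."""
    orientation_factor = lin_amp ** n                           # depends on L only
    contrast_factor = (c / math.sqrt(sigma ** 2 + c ** 2)) ** n  # depends on c, f only
    return orientation_factor * contrast_factor


# Hypothetical values: linear-stage amplitudes for two orientations.
LIN_AMP_A, LIN_AMP_B = 12.0, 9.0
SIGMA, N = 0.15, 1.6

contrasts = [0.03, 0.06, 0.12, 0.25, 0.5, 1.0]
ratios = [
    grating_amplitude(c, LIN_AMP_A, SIGMA, N)
    / grating_amplitude(c, LIN_AMP_B, SIGMA, N)
    for c in contrasts
]
expected_ratio = (LIN_AMP_A / LIN_AMP_B) ** N

for c, r in zip(contrasts, ratios):
    print(f"contrast {c:4.2f}: response ratio {r:.4f}")
```

Every printed ratio equals (LIN_AMP_A / LIN_AMP_B) ** N, so the two contrast-response curves differ only by a constant factor.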

Different spatial frequencies. Changing the spatial frequency of a grating had the same effect on the contrast responses as changing orientation; response amplitude was shifted vertically on a logarithmic scale, and response phase was shifted vertically on a linear scale. Figure 5 shows an example in which the responses elicited by the 1.4 cycles/degree grating (Fig. 5A, left column) were ~70% larger than those elicited by the 1.1 cycles/degree grating (right column). This proportion held substantially constant in the face of response saturation. The fits of the normalization model (continuous curves) capture all these properties of the responses. Indeed, the very same argument about separability in the model responses of contrast and orientation can be made for contrast and spatial frequency.
Fig. 5. Contrast responses for gratings with two different spatial frequencies: 1.4 (gray) and 1.1 (white) cycles/degree. Fits of the normalization model (curves) were performed on a larger data set than shown, which included 40 stimulus conditions (10 contrasts, 2 spatial frequencies, and 2 temporal frequencies). Contrasts <0.12 elicited <1 spike/sec. A, Period histograms. Rows correspond to different contrasts, columns to different spatial frequencies. B, C, Response amplitude and phase as a function of contrast. Responses to each grating in C were shifted vertically so that their values predicted by the model would overlap. D, Polar plot of the responses in B and C. Cell 382l019 (DI = 0.8; SF = 1.4; SZ = 1.9), experiment 5; N = 6. Parameters: τ0 = 18 msec; τ1 = 8 msec; n = 4.


Different temporal frequencies. Changes in the stimulus temporal frequency had very different effects from changes in orientation or spatial frequency. In particular, the above-mentioned invariances of the contrast responses did not hold for stimuli differing in temporal frequency. Rather, we found that increasing the temporal frequency increased the contrast at which the responses saturated and decreased the total phase advance. Similar results (for the amplitude of the responses) were obtained in the cat by Holub and Morton-Gibson (1981) and in the monkey by Hawken and collaborators (1992; also see Albrecht, 1995, Appendix).

Figure 6 illustrates these phenomena. At low temporal frequencies the responses saturated at low contrasts (Fig. 6A, left columns), but at high temporal frequencies they did not show much saturation (right columns). This behavior can be better observed in an amplitude plot (Fig. 6B); the contrast responses differ in their horizontal position, so they could not be superimposed by a vertical shift, as was the case with the contrast responses to different orientations or spatial frequencies.


Fig. 6. Dependence of the contrast responses on temporal frequency. Curves are predictions of normalization model. A, Period histograms. Rows correspond to different contrasts, columns to different temporal frequencies. B, Response amplitude as a function of contrast. The 3.3 Hz data were very close to the 1.6 Hz data and were omitted to avoid clutter. C, Response phase as a function of contrast. Gray levels indicate the temporal frequency as in A. D, Response amplitude as a function of temporal frequency and contrast. Dashed lines connect actual data (dots); continuous lines indicate fits of the model. Fits were performed on a larger data set than shown, which included 64 stimulus conditions (8 contrasts, 4 temporal frequencies, and 2 orientations). Cell 382l021 (DI = 0.1; SF = 1.4; SZ = 7.5), experiment 5; N = 3. Parameters: τ0 = 66 msec; τ1 = 8 msec; n = 4.


The effect of temporal frequency on the contrast responses can be rephrased in terms of the effect of contrast on the temporal frequency tuning. Increasing stimulus contrast increased the responsivity of the cells to the high temporal frequencies. This phenomenon is most visible in Figure 6D, which can be seen as a set of temporal frequency curves measured at different contrasts. Although at low contrasts the cell was essentially low-pass, at high contrasts the cell was mildly bandpass, with the 6.5 Hz stimulus eliciting 46% stronger responses than the 1.6 Hz stimulus. From the quality of the fits it is clear that the normalization model captures this behavior. The linear model, on the other hand, predicts that increasing the contrast should just scale the responses, with no effect on the temporal frequency tuning.

The effect of contrast on the temporal frequency tuning of the normalization model can be understood by observing the effects of changing the conductance on the temporal frequency tuning of an RC circuit (Fig. 7). Increases in conductance reduce the gain of the membrane more at low frequencies than at high frequencies, substantially increasing the cutoff frequency of the membrane. Because the conductance grows with stimulus contrast, at low contrasts the cutoff frequency of the membrane is low, and the low-pass character of the membrane dominates the responses. At higher contrasts the cut-off frequency of the membrane is higher, and the tuning of the responses is determined by the linear receptive field providing input to the membrane. In the case of the cell in Fig. 6, the fits of the model indicate that the tuning of the linear receptive field was bandpass.
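The effect of conductance on an RC circuit can be sketched numerically. The block below assumes a standard first-order RC membrane, C dV/dt + gV = I, with a fixed capacitance (set to 1 in arbitrary units, an assumption of this sketch) so that the time constant τ = C/g falls as the conductance rises. The two time constants are the values quoted in the caption of Figure 6; the function names are ours.

```python
import math


def rc_gain(f, tau, capacitance=1.0):
    """Gain |V/I| of a first-order RC membrane at frequency f (Hz).

    With tau = C/g, the transfer function is (1/g) / (1 + i*2*pi*f*tau),
    so the low-frequency gain is 1/g = tau/C.
    """
    return (tau / capacitance) / math.sqrt(1.0 + (2.0 * math.pi * f * tau) ** 2)


def cutoff_frequency(tau):
    """Frequency (Hz) at which the gain falls to 1/sqrt(2) of its DC value."""
    return 1.0 / (2.0 * math.pi * tau)


TAU_REST, TAU_UNIT = 0.066, 0.008  # seconds; Fig. 6 caption (66 and 8 msec)

gain_ratio = {
    f: rc_gain(f, TAU_UNIT) / rc_gain(f, TAU_REST)
    for f in (1.6, 3.3, 6.5, 13.0)
}

for f, ratio in gain_ratio.items():
    print(f"{f:5.1f} Hz: gain at unit contrast is {ratio:.2f} of the gain at rest")
print(f"cutoff at rest:          {cutoff_frequency(TAU_REST):.1f} Hz")
print(f"cutoff at unit contrast: {cutoff_frequency(TAU_UNIT):.1f} Hz")
```

Under these assumptions the gain falls by a larger factor at 1.6 Hz than at 13 Hz, and the cutoff frequency rises several-fold, in line with the qualitative description above.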


Fig. 7. Effects of changing the conductance g = 1/R in an RC circuit. Circuit parameters, and their dependence on contrast, are estimated from the experiment in Figure 6. Continuous curves show the transfer function at rest (low conductance); dashed curves show the transfer function at unit contrast (high conductance). Arrows indicate decrease in gain (top) and phase advance (bottom) at four temporal frequencies (1.6, 3.3, 6.5, and 13 Hz).


Figure 7 also illustrates an example of how phase advances in an RC circuit with increased conductance. The vertical arrows in the bottom panel of Figure 7 indicate the total phase advance predicted by the model at the four temporal frequencies tested in the experiment of Figure 6. The best fit model parameters predict that phase advance between zero and unit contrast is largest for the 6.5 Hz stimulus (51.9°), marginally smaller for the 3.3 and 13 Hz stimuli (44.4° and 46.9°), and smaller still for the 1.6 Hz stimulus (29.5°). The expression for the total phase advance predicted by the model is:
$$\text{phase advance} = \arctan(2\pi f \tau_0) - \arctan(2\pi f \tau_1), \qquad (6)$$
where f is the stimulus temporal frequency, and τ0 and τ1 are, respectively, the time constant of the membrane at 0 and at unit contrast. The maximal phase advance is achieved at a frequency equal to 1/(2π√(τ0τ1)).
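Equation 6 is easy to evaluate directly. The sketch below uses the time constants quoted in the caption of Figure 6 (66 and 8 msec); because those values are rounded, the results come out close to, but not exactly equal to, the phase advances quoted in the text. The function names are ours.

```python
import math


def phase_advance_deg(f, tau0, tau1):
    """Total phase advance of Eq. 6, in degrees, at temporal frequency f (Hz)."""
    return math.degrees(
        math.atan(2.0 * math.pi * f * tau0) - math.atan(2.0 * math.pi * f * tau1)
    )


def peak_frequency(tau0, tau1):
    """Frequency (Hz) at which Eq. 6 is maximal: 1 / (2*pi*sqrt(tau0*tau1))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(tau0 * tau1))


TAU0, TAU1 = 0.066, 0.008  # seconds; rounded values from the Fig. 6 caption

advances = {f: phase_advance_deg(f, TAU0, TAU1) for f in (1.6, 3.3, 6.5, 13.0)}

for f, adv in advances.items():
    print(f"{f:5.1f} Hz: phase advance {adv:5.1f} deg")
print(f"peak frequency: {peak_frequency(TAU0, TAU1):.2f} Hz")
```

The peak frequency follows from setting the derivative of Equation 6 with respect to f to zero; with these time constants it lies just below 7 Hz, consistent with the largest advance occurring for the 6.5 Hz stimulus.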

The data in Figure 8 exemplify the dependence of phase advance on temporal frequency. For this cell the best fit model parameters predict that the phase advance should be minimal (11.3°) at 1.6 Hz and increase with temporal frequency: 20.77° at 3.3 Hz, 31.8° at 6.5 Hz, and 35.7° at 13 Hz. The data clearly confirm this trend, which was typical of our sample. Indeed, most of the figures in this study display data acquired with temporal frequencies of ~6 Hz. We wanted to provide examples of contrast responses showing clear saturation and clear phase advance. As predicted by the model, we found that temporal frequencies <3 Hz yielded strong saturation but little phase advance, whereas temporal frequencies much >6 Hz showed large phase advances but little saturation.


Fig. 8. Phase advance and temporal frequency. Curves are predictions of normalization model. A, Period histograms. Rows correspond to different contrasts, columns to different temporal frequencies. B, Response phase as a function of contrast. Gray levels indicate the temporal frequency as in A. Fits were performed on a larger data set than shown, which included 60 stimulus conditions (5 contrasts, 4 temporal frequencies, and 3 spatial frequencies). Cell 392l008 (same as Fig. 3), experiment 7; N = 3. Parameters: τ0 = 27 msec; τ1 = 7 msec; n = 1.2.


The increase in phase advance with increasing temporal frequency can also be seen as a decrease in integration time, the slope of a line fitted to a phase versus temporal frequency plot of the data. A similar phenomenon, together with dramatic changes in the temporal frequency tuning of the cells, was observed in cat by Reid et al. (1992) using broad-band high-energy stimuli. The authors of that study pointed out that these behaviors could be explained by changes in the membrane conductance of cortical cells. The normalization mechanism that we propose works exactly that way, and indeed we have shown that it predicts effects similar to those observed by Reid and collaborators (Carandini and Heeger, 1993).

An entire data set. The curves predicted by the model illustrated in the preceding figures were the result of fits to entire data sets, not just to the data appearing in the figures. For example, the responses in Figure 3 were obtained in a grating matrix experiment that included 72 different drifting gratings, with eight different contrasts, three different orientations, and three different temporal frequencies. The full set of responses to these stimuli is shown in Figure 9. This example illustrates the principal properties of the contrast responses; changing orientation shifts the amplitude responses vertically on a logarithmic scale and the phase responses vertically on a linear scale. Amplitude saturation is more prominent at low temporal frequencies; phase advance is more prominent at higher temporal frequencies.
Fig. 9. An entire grating matrix data set. The cell was tested with three different temporal frequencies (A, 3.3 Hz; B, 6.6 Hz; C, 13 Hz), three different orientations (white, 120°; gray, 80°; black, 40°), and nine different contrasts. Some period histograms for these responses are shown in Figure 3A. The shapes of the 18 curves are determined by only 3 parameters: τ0 = 37 msec; τ1 = 9 msec; n = 1.34. Eighteen additional parameters determine the vertical positions of the eighteen curves. Cell 392l008, experiment 4; N = 3.


The 18 curves predicted by the normalization model (9 for amplitude and 9 for phase) provide satisfactory fits to the data. Whereas the vertical position of each curve depends on the linear stage of the model, the shape of all the curves (including their horizontal position) depends on the normalization and rectification stages. In particular, the vertical position of each curve is determined by one parameter, corresponding to the amplitude or phase of the response of the linear stage to each grating at full contrast. The shape and horizontal position of all the curves, instead, are determined by a total of three parameters. The first two are the time constants τ0 and τ1 of the membrane at rest and at full contrast; these characterize the normalization stage and [by determining σ(f)] control the horizontal position of the amplitude curves and the steepness of the phase curves. The third parameter is the exponent n, which characterizes the rectification stage. It controls the steepness of the amplitude curves below saturation, and has no effect on the phase curves.

Responses to plaids

We now consider the responses to a wider set of visual stimuli: plaids composed of two drifting gratings having the same temporal frequency. The gratings differed in orientation and/or in spatial frequency, and their contrasts c1 and c2 assumed a variety of different values.

Cells in the cat primary visual cortex display a phenomenon known as "cross-orientation inhibition" (Morrone et al., 1982; Bonds, 1989; Gizzi et al., 1990), in which the responses to optimal stimuli are inhibited by the presence of stimuli of nonoptimal orientation, which would elicit negligible responses if presented alone. More generally, there are numerous reports of conditions in which cells in the cat visual cortex are inhibited by stimuli that elicit no response when presented alone. This inhibition has been found to be independent of direction of motion, largely independent of orientation, and broadly tuned for spatial and temporal frequency (Bishop et al., 1973; Dean et al., 1980; Burr et al., 1981; Hammond and MacKay, 1981; Morrone et al., 1982; De Valois and Tootell, 1983; Kaji and Kawabata, 1985; Gulyas et al., 1987; Bonds, 1989; Nelson, 1991; DeAngelis et al., 1992; Geisler and Albrecht, 1992). Cross-orientation inhibition can be elicited with one grating in each eye, although suppression with both gratings in the same eye is typically stronger (Ferster, 1981; Ohzawa and Freeman, 1986a,b; Freeman et al., 1987; DeAngelis et al., 1992; Sengpiel and Blakemore, 1994; Sengpiel et al., 1995; Walker et al., 1996).

Our results indicate that cross-orientation inhibition is present in most cells of the monkey primary visual cortex. An example of this is shown in Figure 10, which shows the responses of a simple cell to a plaid with components that drifted in orthogonal directions. Although one of the gratings (grating 1) was quite effective in driving the cell (Fig. 10A, left column), the other (grating 2) elicited almost no spikes when presented alone (top row). Its presence, however, clearly suppressed the responses to the first grating. The inhibitory effect of the second grating can be observed more precisely in Figure 10B, which shows the contrast responses of the cell for four different contrasts of grating 2. As observed by Bonds (1989) in the cat, the presence of the second grating shifts the contrast response to the right on a logarithmic scale. This shift to the right would not be explained by the linear model; if cross-orientation inhibition were attributable to a linear interaction between two (possibly subthreshold) linear responses, it would subtract from the responses a fixed quantity. The responses to the first grating would saturate at the same contrast, irrespective of the contrast of the second grating. As shown in Figure 10, this is not the case.


Fig. 10. Masking by an orthogonal grating. Responses to a plaid experiment in which one component was nearly optimally oriented (grating 1), and the other was orthogonal and ineffective in driving the cell when presented alone (grating 2). Curves are fits of the normalization model. A, Period histograms for different contrasts of the components. Rows, Different contrasts of grating 1 (c1). Columns, Different contrasts of grating 2 (c2). As c2 was increased, the responses decreased in size (cross-orientation inhibition). B, Response amplitude as a function of c1, for different values of c2 (white to black: 0.06, 0.12, 0.25, and 0.5). As c2 increased, the contrast responses shifted to the right; more and more contrast of grating 1 was needed to maintain a set level of firing. C, Same data, plotted as a function of c2, for different values of c1 (white to black: 0, 0.06, 0.25, and 0.5). Cell 392l024 (DI = 0.4; SF = 0.1; SZ = 6.8), experiment 9; N = 3. Parameters: τ0 = 158 msec; τ1 = 5 msec; n = 2.3.


The shift to the right of the contrast responses corresponds to an effective scaling of stimulus contrast. This is the behavior predicted by the normalization model (Heeger, 1992b), which, as illustrated by the curves in Figure 10, provided good fits to our plaid data. Approximate equations for the amplitude and phase of the responses of the model to plaids are derived in Appendix. The expression for response amplitude is:
$$\mathrm{amplitude}(R) \propto \left[\frac{\mathrm{amplitude}\bigl(c_1 L_1(t) + c_2 L_2(t)\bigr)}{\sqrt{\sigma(f)^2 + c_1^2 + c_2^2}}\right]^{n}, \qquad (7)$$
where c1 and c2 are the contrasts of the two gratings, L1(t) and L2(t) are the responses of the linear receptive field to the individual gratings at unit contrast, and the remaining symbols have the same meaning as in the expression for the response to individual gratings (Eq. 5). Since the receptive field of the cell is linear, its response to the plaid is just a linear combination of its responses to the individual gratings, c1L1(t) + c2L2(t). The normalization stage divides that by approximately √(σ(f)² + c1² + c2²) (see Appendix). If, as in Figure 10, grating 2 alone does not elicit any response (L2 ≈ 0), then the effect of an increase of c2 in the denominator is to shift the contrast response to the right on the log contrast axis (Heeger, 1992b).
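The rightward shift can be verified with a few lines of code. This is a sketch of the right-hand side of Equation 7 only (the proportionality constant is omitted), with the two linear responses represented as complex phasors whose modulus and angle stand for amplitude and phase; that representation, the function names, and all parameter values are assumptions made for illustration. When L2 = 0, adding a mask of contrast c2 is equivalent to replacing σ by √(σ² + c2²), so the masked curve equals the unmasked curve evaluated at a contrast scaled down by a fixed factor.

```python
import cmath
import math


def plaid_amplitude(c1, c2, lin1, lin2, sigma, n):
    """Right-hand side of Eq. 7 (up to a constant of proportionality).

    lin1 and lin2 are complex phasors for the unit-contrast linear
    responses L1 and L2; their sum is a vector sum of sinusoids.
    """
    numerator = abs(c1 * lin1 + c2 * lin2)
    return (numerator / math.sqrt(sigma ** 2 + c1 ** 2 + c2 ** 2)) ** n


def shift_factor(c2, sigma):
    """Factor by which the mask scales the effective contrast of grating 1."""
    return math.sqrt(sigma ** 2 + c2 ** 2) / sigma


# Hypothetical values: grating 2 drives no linear response (L2 = 0).
LIN1 = cmath.rect(30.0, 0.0)
LIN2 = 0.0
SIGMA, N = 0.1, 2.3
C2 = 0.25

k = shift_factor(C2, SIGMA)
for c1 in (0.03, 0.06, 0.12):
    unmasked = plaid_amplitude(c1, 0.0, LIN1, LIN2, SIGMA, N)
    masked_shifted = plaid_amplitude(k * c1, C2, LIN1, LIN2, SIGMA, N)
    print(f"c1 = {c1:.2f}: unmasked {unmasked:8.3f}, masked at {k:.2f} * c1 {masked_shifted:8.3f}")
```

Each pair of printed values agrees, so on a log contrast axis the masked curve is the unmasked curve shifted to the right by log k.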

The pure rightward shift of the contrast responses occurs only when the cell is completely unresponsive to the masking grating. When each grating in the plaid elicits (even minimal) responses when presented alone, their combined effect is more complicated. In this case the sinusoidal responses of the linear receptive field to the individual gratings are added together before the normalization stage. Depending on their relative phase they can add constructively or destructively. An example of this is shown in Figure 11. The top and bottom rows in Figure 11A show the period histograms of the responses of a cell to two gratings of different spatial frequency. Both gratings elicited strong responses, with phases differing by approximately 90°. The responses to the "plaids" obtained by summing the gratings are shown in the middle row.


Fig. 11. Responses to the sum of two equally effective stimuli. Components differed in spatial frequency. Continuous curves are fits of the normalization model. A, Period histograms for four different contrasts. Dark gray, Responses to grating 1 (1.2 cycles/degree). White, Responses to grating 2 (0.6 cycles/degree). Light gray, Responses to the sums of the stimuli in the top and bottom rows. B, Polar plot of the first harmonic responses. Squares indicate the vectorial sums of the responses to the individual gratings (dark gray and white data points). The actual responses to the plaid were smaller (closer to the origin) and occurred slightly sooner (counterclockwise) than this linear prediction. Cell 385r037 (DI = 0.9; SF = 0.8; SZ = 1.7), experiment 05; N = 4. Parameters: τ0 = 45 msec; τ1 = 2 msec; n = 2.44.


The sum of sinusoids is best understood in a polar plot (Fig. 11B), in which every sinusoid corresponds to a vector, and the sum of sinusoids is just a sum of vectors. The dark gray data points are the responses to grating 1; the white data points are the responses to grating 2. The light gray data points are the responses to the plaid obtained by superimposing the two gratings. The squares indicate the linear predictions for the plaid responses obtained by summing (vectorially) the responses to the individual gratings. The actual plaid responses show more saturation (they remain closer to the origin) than these linear predictions. They also occur earlier (their angle with the horizontal axis is larger) than the linear predictions. Although not perfect, the fits of the normalization model (continuous curves) capture both phenomena. This is because the local stimulus energy of the plaid is greater than that of the individual gratings. In the model this results in higher membrane conductance, which causes a decrease in gain and time constant.

Figure 12 illustrates another example of plaid responses. In this case two orthogonal gratings were able to drive the cell. Grating 2 was not as effective as grating 1, but it did elicit some spikes when presented alone. The dependence of the responses on the contrasts of the gratings is complicated: depending on the contrast of grating 1, increasing the contrast of grating 2 either enhanced or suppressed the responses. This behavior would be hard to explain at the level of a single cell. Instead, as shown by the continuous curves fit to the responses, it is precisely predicted by the normalization model. The contrasts of the two gratings, c1 and c2, appear both in the numerator and in the denominator of Equation 7. Increasing one of the two can result either in an enhancement or in a reduction in the response, depending on the amplitudes and phases of the underlying linear responses L1 and L2.
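How the same mask can either enhance or suppress the response follows from the contrasts appearing in both the numerator and the denominator of Equation 7. The sketch below evaluates the right-hand side of Equation 7 with complex phasors for L1 and L2. All values are hypothetical (a weak, in-phase L2 one-fifth the size of L1) and are not the parameters fitted to the cell in Figure 12; the function name is ours.

```python
import math


def plaid_amplitude(c1, c2, lin1, lin2, sigma, n):
    """Right-hand side of Eq. 7 (up to a constant of proportionality)."""
    numerator = abs(c1 * lin1 + c2 * lin2)   # vector sum of linear responses
    return (numerator / math.sqrt(sigma ** 2 + c1 ** 2 + c2 ** 2)) ** n


# Hypothetical values: grating 2 is weakly effective and in phase with grating 1.
LIN1, LIN2 = complex(1.0, 0.0), complex(0.2, 0.0)
SIGMA, N = 0.1, 2.0

mask_contrasts = (0.0, 0.06, 0.12, 0.25, 0.5)
curves = {
    c1: [plaid_amplitude(c1, c2, LIN1, LIN2, SIGMA, N) for c2 in mask_contrasts]
    for c1 in (0.0, 0.12)
}

for c1, responses in curves.items():
    formatted = ", ".join(f"{r:.3f}" for r in responses)
    print(f"c1 = {c1:.2f}: response vs c2 -> {formatted}")
```

With grating 1 absent (c1 = 0), increasing c2 raises the response because only the numerator term c2·L2 drives the cell. At an intermediate c1, the growth of the denominator outweighs the small contribution of c2·L2 to the numerator, so the same increase in c2 lowers the response.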


Fig. 12. Masking with a grating that is effective in driving the cell. Responses to a plaid experiment in which one component was nearly optimally oriented (grating 1), and the other was orthogonal but still elicited some response when presented alone (grating 2). A, Period histograms for different contrasts of the components. Rows, Different contrasts of grating 1 (c1). Columns, Different contrasts of grating 2 (c2). When presented alone, grating 1 elicited strong responses (left column), grating 2 weak responses (top row). B, Response amplitude as a function of c2, for different values of c1 (white to black: 0, 0.06, 0.12, and 0.5). Increasing c2 increased the size of the responses when grating 1 was absent; it inhibited the responses for intermediate contrasts of grating 1, and it had little effect for high contrasts of grating 1. Cell 392r013 (DI = 0.9; SF = 0.4; SZ = 3.4), experiment 12; N = 3. Parameters: τ0 = 136 msec; τ1 = 1.4 msec; n = 2.22.


Figure 13 illustrates the responses of the same cell to different plaids. The top panel in Figure 13A replots the amplitude data of Figure 12, and the bottom panel shows the corresponding phase data, illustrating that increasing the contrast of either grating resulted in phase advance. In Figure 13A grating 2 drifted at 90° with respect to grating 1, and it elicited responses that were smaller by about a factor of five. When grating 2 was replaced by one drifting at 30° with respect to grating 1, it elicited responses that were only marginally smaller than those to grating 1 (Fig. 13B, top panel). The phases of the responses to the two individual gratings were almost opposite (Fig. 13B, bottom panel), ~0° for grating 1 and ~135° for grating 2. As a result the two stimuli interacted destructively, as witnessed by the dip in the diagonal region of the top panel in Figure 13B. In that region increasing the contrast of either of the two gratings reduced the amplitude of the responses. The model clearly captures this phenomenon, which is principally attributable to its linear stage. When the spatial phase of grating 2 was changed by 90° (Fig. 13C), this phenomenon disappeared. Now increasing the contrast of either grating increased the size of the responses.


Fig. 13. Amplitude (top) and phase (bottom) of the responses of a cell to