Volume 17, Number 21,
Issue of November 1, 1997
pp. 8621-8644
Copyright ©1997 Society for Neuroscience
Linearity and Normalization in Simple Cells of the Macaque
Primary Visual Cortex
Matteo Carandini1, David J. Heeger2, and J. Anthony Movshon1
1 Howard Hughes Medical Institute and Center for Neural
Science, New York University, New York, New York 10003, and
2 Department of Psychology, Stanford University, Stanford,
California 94305
ABSTRACT
Simple cells in the primary visual cortex often appear to compute a
weighted sum of the light intensity distribution of the visual stimuli
that fall on their receptive fields. A linear model of these cells has
the advantage of simplicity and captures a number of basic aspects of
cell function. However, it fails to account for important response
nonlinearities, such as the decrease in response gain and latency
observed at high contrasts and the effects of masking by stimuli that
fail to elicit responses when presented alone. To account for these
nonlinearities we have proposed a normalization model, which extends
the linear model to include mutual shunting inhibition among a large
number of cortical cells. Shunting inhibition is divisive, and its
effect in the model is to normalize the linear responses by a measure
of stimulus energy. To test this model we performed extracellular
recordings of simple cells in the primary visual cortex of anesthetized
macaques. We presented large stimulus sets consisting of (1) drifting
gratings of various orientations and spatiotemporal frequencies; (2)
plaids composed of two drifting gratings; and (3) gratings masked by full-screen spatiotemporal white noise. We derived expressions for the
model predictions and fitted them to the physiological data. Our
results support the normalization model, which accounts for both the
linear and the nonlinear properties of the cells. An alternative model,
in which the linear responses are subject to a compressive
nonlinearity, did not perform nearly as well.
Key words: visual cortex; contrast; nonlinearity; gain control; normalization; masking; noise
INTRODUCTION
A longstanding view of simple cells
in the primary visual cortex is that they compute a weighted sum of the
light intensities falling on their receptive field (Hubel and Wiesel,
1962; Movshon et al., 1978a; Carandini et al., 1997b). This
linear model is depicted in Figure 1A and is usually taken
to include a rectification (thresholding) stage to account for the
transformation of intracellular signals into firing rates.
Fig. 1.
Two models of simple cell function. A,
The linear model, composed of a linear stage
(receptive field) and a rectification stage. The linear
stage performs a weighted sum of the light intensities over local space
and recent time. This sum is converted into a positive firing rate by
the rectification stage. Rectification is a nonlinearity, so the
"linear model" is not entirely linear. B, The
normalization model extends the linear model by adding a
divisive stage. The linear stage feeds into a circuit composed of a
resistor and a capacitor in parallel (RC circuit). The
conductance of the resistor grows with the pooled output of a large
number of cortical cells. This effectively divides the output of the linear stage.
Although many aspects of simple cell responses are consistent with the
linear model, there also are important violations of linearity. For
example, scaling the contrast of a stimulus would identically scale the
responses of a linear cell. At high contrasts, however, the responses
of simple cells show clear saturation (Maffei and Fiorentini, 1973).
Moreover, simple cells are subject to cross-orientation inhibition; the
responses to an optimally oriented stimulus can be diminished by
superimposing an orthogonal stimulus that is ineffective in driving the
cell when presented alone (Morrone et al., 1982; Bonds, 1989; Bauman
and Bonds, 1991).
According to a view that has emerged in recent years, the
nonlinearities of simple cells could be explained by extending the linear model to include a gain control stage (Albrecht and Geisler, 1991; Heeger, 1991, 1992b, 1993; DeAngelis et al., 1992; Carandini and
Heeger, 1994; Nestares and Heeger, 1997; Tolhurst and Heeger, 1997a,b).
In particular, one of us (Heeger, 1991, 1992b) proposed a
normalization model (Fig. 1B), in which
the linear response of every cell is divided (or "normalized") by a
number that grows with the activity of a large number of cortical
cells, the normalization pool. The normalization model
attributes the selectivity of a cell to the initial linear stage and
its nonlinear behavior to the division stage. For example, the model
predicts response saturation because the divisive suppression increases
with stimulus contrast, and the model predicts cross-orientation
inhibition because the normalization pool includes neurons with a wide
variety of tuning properties, many of which respond to orthogonal
gratings.
Previously, we have suggested a possible biophysical
implementation of the normalization model (Fig. 1B)
(Carandini and Heeger, 1994). The cell membrane is modeled as an
RC circuit, composed of a resistor and a capacitor in
parallel. The linear stage injects synaptic current into the cell, and
normalization operates by controlling the conductance of the resistor,
i.e., the membrane conductance. The cells in the normalization pool
effectively inhibit each other by increasing the membrane conductance
of each other. This shunting inhibition controls the gain of
the transformation of input current to output membrane potential. A
rectification stage converts the latter into a firing rate.
To test this model against large data sets obtained in monkey primary
visual cortex, we recorded the responses of simple cells in area V1 of
paralyzed, anesthetized macaques, while presenting a variety of visual
stimuli. These stimuli included drifting gratings, plaids composed of
two drifting gratings, and drifting gratings superimposed on
full-screen spatiotemporal white noise. The gratings had a wide range
of contrasts, temporal frequencies, spatial frequencies, and
orientations. We derived equations for the model responses to such
stimuli, and we found that these equations provided good fits to the
neural responses.
Portions of this work have been presented briefly elsewhere (Carandini
and Heeger, 1994, 1995).
MATERIALS AND METHODS
Experiments were performed on five cynomolgus macaque monkeys
(Macaca fascicularis) and four pigtail macaque monkeys
(M. nemestrina) ranging in weight from 1.5 to 4 kg.
Preparation and maintenance
Animals were initially anesthetized with ketamine HCl (10 mg/kg)
and premedicated with atropine sulfate (0.05 mg/kg) and acepromazine maleate (0.1 mg/kg). Anesthesia continued on 1.5-2.0% halothane in a
98% O2-2% CO2 mixture while the initial
surgery was performed. Indwelling catheters were introduced into the
saphenous veins of each hindlimb, and a tracheotomy was performed.
The animal was then mounted in a stereotaxic instrument, and halothane
anesthesia was replaced by a continuous infusion of sufentanil citrate
(typically 4-6 µg·kg⁻¹·hr⁻¹, beginning
with a loading dose of 4 µg/kg). EEG, ECG, and arterial blood
pressure were monitored continuously, and any signs of arousal were
corrected by modifying the rate of anesthetic infusion. The monkey was
artificially respirated with a mixture of O2,
N2O, and CO2 adjusted so that end-tidal
CO2 was maintained at 3.8-4.0%. Rectal temperature was
kept near 37°C with a heating pad.
A small craniotomy was performed, usually 9-10 mm lateral to the
midline and 3-4 mm posterior to the lunate sulcus. This location often
yielded two encounters with the primary visual cortex, with eccentricities first at ~2-5° and then at ~8-15°. A small
slit in the dura was made, and a vertical hydraulic microdrive
containing a glass-coated tungsten microelectrode (Merrill and
Ainsworth, 1972) in a guide tube was positioned. The craniotomy was
covered with a chamber containing 4% agar in sterile saline
solution.
On completion of surgery, animals were paralyzed to minimize eye
movements. Paralysis was maintained with an infusion of vecuronium bromide (Norcuron, 0.1 mg·kg⁻¹·hr⁻¹) in lactated
Ringer's solution with dextrose (5.4 ml/hr). The pupils were dilated
and accommodation paralyzed with topical atropine. The corneas were
protected with zero power gas-permeable contact lenses; supplementary
lenses were chosen to focus the eyes on a tangent screen plotting table
set up at a distance of 57 in. To maintain the animal in good
physiological condition during experiments (typically 72-96 hr),
intravenous supplementation of 2.5% dextrose/lactated Ringer's was
given at 5-15 ml/hr. Animals received daily injections of a
broad-spectrum antibiotic (Bicillin) as well as an anti-inflammatory
agent (dexamethasone) to prevent cerebral edema.
Stimuli
Stimuli were generated by a Truevision ATVista board operating
at a resolution of 582 × 752 and a frame rate of 106 Hz, the output of which was directed to a Nanao T560i monitor (mean luminance, 72 cd/m², subtending 10-25° of visual angle).
Nonlinearities in the relation between applied voltage and phosphor
luminance were compensated by appropriate look-up tables. Stimulus
strength is measured in units of contrast, defined as the
difference between the highest and lowest intensities, divided by the
sum of the two.
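The contrast definition above (often called Michelson contrast) can be written as a one-line function. This is a minimal sketch; the function name and the example luminances are illustrative and not taken from the study.

```python
def contrast(l_max, l_min):
    """Contrast as defined in the text: (highest - lowest) / (highest + lowest)."""
    total = l_max + l_min
    if total <= 0:
        raise ValueError("intensities must sum to a positive value")
    return (l_max - l_min) / total

# Hypothetical grating modulating between 36 and 108 cd/m^2 (mean 72 cd/m^2).
half = contrast(108.0, 36.0)   # 0.5
# A grating modulating between zero and twice the mean has unit contrast.
unit = contrast(144.0, 0.0)    # 1.0
```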
Drifting luminance-modulated sinusoidal gratings were presented alone
or superimposed on another grating or on a noise background. Superposition was obtained by interleaving, i.e., by presenting the two
components in alternate frames. When two gratings were presented
together they had the same temporal frequency and differed in
orientation and/or spatial frequency. Their contrast could be varied
independently. The noise background was composed of square pixels, the
size of which was chosen for each cell to be approximately one-fourth
of the spatial period of the optimal grating. Occasionally we used
one-dimensional noise (bars rather than squares). The intensity of each
square was randomly refreshed at 13.4 or 26.8 Hz and assumed one of two
possible values.
All the stimuli had the same mean luminance. The grating and plaid
stimuli were vignetted by a square window, the size of which was chosen
to elicit the maximal responses. The noise masks occupied the whole
screen. In their absence the surrounding field was uniform.
Experiments. Experiments consisted of two to nine
consecutive blocks of stimuli. Each block consisted of a
random permutation of 5-90 stimuli. Randomization was adopted to
minimize the effects of adaptation and other nonstationarities. The
stimuli had equal duration (generally 5-10 sec) and were separated by
uniform field presentations lasting about 4 sec.
Experimental protocol. Receptive fields were initially
mapped by hand on a tangent screen. When the activity of a single
neuron was isolated, we established the dominant eye of the neuron and occluded the other eye. We then positioned the receptive field on the
face of the monitor, and quantitative experiments proceeded under
computer control.
To characterize each cell we performed the following sequence of
measurements using single gratings: (1) orientation and direction tuning; (2) spatial frequency tuning; (3) temporal frequency tuning; and (4) stimulus size tuning. Each of these measurements was performed at the optimal values of the parameters as obtained from the previous measurements. Cells were classified as simple or complex on the basis
of the frequency component of their response to the drifting grating
eliciting the maximum number of spikes, as classified by Skottun et al.
(1991). If the cell was simple we proceeded to the core experiments in
this study. These were of three types:
(1) Grating matrix experiments, consisting of drifting
sinusoidal stimuli having 5-10 different contrasts, two to four
different temporal frequencies, and two to four different orientations
or spatial frequencies. A typical experiment would involve three orientations or spatial frequencies, three temporal frequencies, and
five contrasts, yielding a total of 45 stimuli.
(2) Plaid experiments, consisting of sums of two gratings
with contrasts that were independently varied. Often the two directions were opposite, and the "plaid" was a counterphase flickering
grating. A typical experiment would involve two orthogonal gratings
with contrasts that assumed five possible values, yielding a total of
25 different stimuli.
(3) Noise-masking experiments, in which the contrast
response to drifting gratings was measured in the presence of noise at different contrasts. A typical experiment would involve nine grating contrasts and two noise contrasts (0 and 0.5), yielding a total of 18 different stimuli.
Data analysis
Amplified and bandpass-filtered signals from the microelectrode
were fed into a hardware window discriminator. A computer interface
(Cambridge Electronic Design 1401 Plus) collected the pulses triggered
by each action potential and the synchronization signals from the video
graphics board.
Response measure. Our measure of cell response is the
first harmonic r of the spike trains, a complex number
indicating the amplitude and phase of the best-fitting sinusoid having
the same temporal frequency as the stimulus. This number is obtained
from the spike train by computing $r = (1/D) \sum_k [\cos(2\pi f t_k) + i \sin(2\pi f t_k)]$, where
D is the stimulus duration, f is the temporal
frequency of the stimulus, and the $t_k$ are the
times of the individual spikes. The amplitude of the first harmonic has
units of spikes per second. The responses r obtained in an
experiment constitute a matrix $r = \{r_{s,b}\}$, where the subscripts
indicate the sth stimulus presented in the bth
stimulus block. We denote the mean across blocks of the responses as
the vector $\bar{r} = \{\bar{r}_s\}$. For example, in an
experiment in which three blocks of 25 different stimuli were run, the
matrix r would contain 75 elements, and the vector $\bar{r}$
would contain 25 elements.
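As an illustration of the response measure, the sketch below computes the first harmonic of a spike train and the across-block mean. It assumes the sum runs over both the cosine and sine terms, so that each spike contributes exp(2πi f t_k); the function names and the synthetic spike train are hypothetical.

```python
import numpy as np

def first_harmonic(spike_times, freq_hz, duration_s):
    """Complex first harmonic r = (1/D) * sum_k exp(2*pi*i*f*t_k), in spikes/sec."""
    t = np.asarray(spike_times, dtype=float)
    return np.sum(np.exp(2j * np.pi * freq_hz * t)) / duration_s

def block_mean(r):
    """Average an (n_stimuli, n_blocks) response matrix across blocks."""
    return np.asarray(r).mean(axis=1)

# Synthetic spike train: one spike per cycle of a 4 Hz grating,
# each spike 50 ms after the start of its cycle, for 2 s (8 spikes).
f, D = 4.0, 2.0
spikes = np.arange(0.0, D, 1.0 / f) + 0.05
r = first_harmonic(spikes, f, D)
amplitude, phase = abs(r), np.angle(r)   # 4 spikes/sec at a phase of 0.4*pi rad
```

Because every spike falls at the same stimulus phase, the amplitude equals the mean firing rate (8 spikes in 2 s); spikes spread over the cycle would give a smaller amplitude.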
Correction for eye movements. Inspection of the spike
rasters often revealed a few discrete misalignments across stimulus blocks in the responses to individual stimuli, which are best explained
by the presence of small eye movements. For drifting grating stimuli
the sole effect of these eye movements would be a shift in response
timing. We reduced this effect by shifting in time all the responses in
each block by an amount chosen to minimize
$\sigma_s^2$, the variance across blocks of the
responses. Because all the responses in a block are translated by the
same amount, this method would completely remove the effect of the
movements only if they occurred exactly between blocks. In all other
cases it is just an approximation that reduces the variance of the
data. No attempt was made to correct the effect of possible eye
movements on the responses to plaids or to gratings in the presence of
noise.
Estimation of the variance. The number of blocks in our
experiments (two to nine) was not sufficient to obtain reliable
estimates of the variance $\sigma_s^2$ of the responses
to each stimulus s. For this reason we estimated the
dependence of $\sigma_s^2$ on $|\bar{r}_s|$, the amplitude of the mean
responses. As a functional form for this dependence we chose the simple
relation $\sigma_s^2 = \alpha |\bar{r}_s|^{\beta}$, where $\alpha$ and $\beta$
are free parameters. This expression provided very good fits to
the data. In the fits, the scale factor $\alpha$
was on average 2.11 ± 0.18, and the exponent $\beta$
was on average 1.18 ± 0.02, consistent
with previous findings that the variance of the responses of V1 neurons
is proportional to their mean (Dean, 1981b; Tolhurst et al., 1983;
Bradley et al., 1987; Vogels et al., 1989).
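A power-law relation between variance and mean amplitude, with a scale factor and an exponent as free parameters, can be fitted in several ways. The sketch below uses ordinary least squares in log-log coordinates; this fitting procedure is an assumption, since the text does not say how the fit was performed, and the names `alpha` and `beta` are labels chosen here for the scale factor and the exponent.

```python
import numpy as np

def fit_variance_power_law(mean_amp, variance):
    """Fit variance = alpha * mean_amp**beta by linear regression on logs."""
    x = np.log(np.asarray(mean_amp, dtype=float))
    y = np.log(np.asarray(variance, dtype=float))
    beta, log_alpha = np.polyfit(x, y, 1)   # slope first, then intercept
    return float(np.exp(log_alpha)), float(beta)

# Noise-free synthetic check built from the average values quoted in the text.
amp = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 40.0])
alpha, beta = fit_variance_power_law(amp, 2.11 * amp ** 1.18)
```

With real data, amplitudes or variances equal to zero would have to be excluded before taking logarithms.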
Model fits. The models discussed in Results were fit to the
responses to all stimuli in an experiment. Different experiments were
fitted independently and thus yielded different sets of parameters. To
fit the predictions of a model $m = \{m_s\}$ to the data we performed a
weighted least squares fit; i.e., we searched for the parameters
a that minimized the error function
$$E(a) = \sum_s \frac{|m_s(a) - \bar{r}_s|^2}{\sigma_s^2},$$
where the $\sigma_s^2$ are the estimated
variances. To avoid giving too much importance to data points of low
amplitude, when fitting the models of the visual responses we took all
the $\sigma_s^2 < 1$ to be equal to 1.
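A weighted least squares error of this kind might be computed as follows. The sketch assumes the error is the sum over stimuli of the squared complex distance between prediction and mean response, divided by the estimated variance, with variances below 1 raised to 1 as described above. The function and argument names are hypothetical.

```python
import numpy as np

def weighted_error(model, mean_resp, var_est, var_floor=1.0):
    """Sum over stimuli of |m_s - r_s|^2 / max(sigma_s^2, var_floor)."""
    m = np.asarray(model, dtype=complex)
    r = np.asarray(mean_resp, dtype=complex)
    v = np.maximum(np.asarray(var_est, dtype=float), var_floor)
    return float(np.sum(np.abs(m - r) ** 2 / v))

# Two stimuli: the first variance (0.25) is below the floor and is raised to 1.
err = weighted_error([1 + 0j, 0j], [0j, 2j], [0.25, 4.0])   # 1/1 + 4/4 = 2.0
```

A function like this could be passed to a general-purpose optimizer that searches over the model parameters.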
Percentage of the variance. To gain an intuitive assessment
of the quality of the fits provided by a model, we computed the percentage of the variance across stimuli for which the model accounted. To define this measure it is useful to consider the (mean
square) distance between two sets of responses $x = \{x_s\}$ and $y = \{y_s\}$:
$$d(x, y) = \frac{1}{N} \sum_s |x_s - y_s|^2,$$
where the sum is over the stimuli s, and N
is the number of stimuli. The percentage of the variance accounted for
by the model may then be expressed as:
$$100 \left[ 1 - \frac{d(m, \bar{r})}{d(\bar{r}, \langle r \rangle)} \right],$$
where $\langle r \rangle$ is the response mean computed across
stimuli and across blocks. In this expression, the numerator is the distance between the model predictions and the mean cell responses; the
denominator is the variance across stimuli of the mean cell responses.
For example, if the model predicts the mean responses exactly, then it
accounts for 100% of the variance. More realistically, if the mean
error between the model predictions and the responses is
$d(m, \bar{r})$ = 10 spikes/sec, and the
responses in the data set have very different amplitudes and/or phases,
so that their variance is large, say $d(\bar{r}, \langle r \rangle)$ = 100 spikes/sec, then the model accounts for 90%
of the variance in the data.
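The percentage-of-variance measure can be sketched in a few lines. The code assumes the distance is the mean over stimuli of the squared complex difference, and that every stimulus was shown in the same number of blocks, so that the grand mean equals the mean of the block-averaged responses. Function names are illustrative.

```python
import numpy as np

def distance(x, y):
    """Mean square distance between two sets of (complex) responses."""
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    return float(np.mean(np.abs(x - y) ** 2))

def percent_variance(model, mean_resp):
    """100 * (1 - d(model, mean_resp) / variance of mean_resp across stimuli)."""
    r = np.asarray(mean_resp, dtype=complex)
    grand_mean = np.full(r.shape, r.mean())
    return 100.0 * (1.0 - distance(model, r) / distance(r, grand_mean))

# Reproduces the worked example: model error 10, response variance 100.
mean_resp = [0.0, 20.0]          # variance across stimuli = 100
model = [2.0, 16.0]              # errors of 2 and 4 give a mean square of 10
pct = percent_variance(model, mean_resp)   # about 90
```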
Bootstrap test. Although the percentage of the variance is
an intuitive measure of the quality of the fits, it has the
disadvantage of taking into account only the variability across stimuli
and not the variability across blocks. If a cell were very noisy, our
experiments would yield bad estimates of its mean responses $\bar{r}_s$; in this case the model would account
for a small percentage of the variance in the data even if it reflected
the exact physical reality underlying the responses. To test the
quality of the model predictions taking into account all the
statistical properties of the data, we performed a bootstrap hypothesis
test (Efron and Tibshirani, 1991). The advantage of bootstrapping is
that it does not assume that the response variability follows a
particular (e.g., Gaussian) distribution.
We tested whether we could reject the null hypothesis that the mean of
the probability distribution underlying the neural responses was
identical to the predictions of the model. Let
$r_b$ be the vector of responses obtained in the
b-th block of stimuli. If for example an experiment involved
25 different stimuli and was repeated four times, there would be four
vectors of responses, $r_1$, $r_2$, $r_3$, and
$r_4$, and each would contain 25 elements.
Let m be the prediction of the model obtained by fitting all
the $r_b$. The null hypothesis states that the mean
$\mu_r$ of the probability distribution from which the $r_b$ are drawn is identical to the prediction of
the model:
$$H_0: \mu_r = m.$$
As a test statistic we chose the distance between the model
predictions and the empirical average of the responses:
$$t(r) = d(m, \bar{r}).$$
Having observed a value $t_{obs}$ by
evaluating the test statistic on the actual experimental data, we
calculated the probability of observing at least that large a value if
the null hypothesis were true. This probability is the achieved
significance level (ASL) of the test:
$$\mathrm{ASL} = \mathrm{Prob}_{H_0}(t^* \geq t_{obs}).$$
The smaller the ASL, the stronger the evidence against
$H_0$.
To compute the ASL with the bootstrap method, we converted our data set
r into one with an empirical distribution function that
obeyed $H_0$. This was simply done by shifting the data so
that the mean responses were exactly equal to the model predictions,
$\tilde{r} = r - \bar{r} + m$ (Efron and Tibshirani, 1993). We then computed the
bootstrap estimate of the ASL by repeating the following steps 1000 times: (1) Draw a sample data set r* with replacement from
$\tilde{r}$. For example, if the experiment was repeated four
times, a possible draw would be $r^* = \{\tilde{r}_4, \tilde{r}_1, \tilde{r}_2, \tilde{r}_2\}$;
another one could be $r^* = \{\tilde{r}_2, \tilde{r}_1, \tilde{r}_2, \tilde{r}_3\}$,
and so on. (2) Compute the test statistic on the sample, $t^* = d(m, \bar{r}^*)$.
The bootstrap estimate of the achieved significance level of the test
is equal to the percentage of samples for which the t*
values are larger than the observed value
$t_{obs}$.
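The bootstrap procedure described above can be sketched as follows. The code assumes that whole blocks are resampled with replacement, that the test statistic is the mean square distance between the model predictions and the block-averaged responses, and that the ASL is returned as a fraction rather than a percentage. The function names, the random seed, and the use of "at least as large" for the comparison are choices made here.

```python
import numpy as np

def distance(x, y):
    """Mean square distance between two sets of (complex) responses."""
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(y)) ** 2))

def bootstrap_asl(r, model, n_boot=1000, seed=0):
    """r: (n_blocks, n_stimuli) responses; model: (n_stimuli,) predictions."""
    r = np.asarray(r, dtype=complex)
    m = np.asarray(model, dtype=complex)
    rng = np.random.default_rng(seed)
    t_obs = distance(m, r.mean(axis=0))
    # Shift the data so that its mean equals the model prediction (obeys H0).
    shifted = r - r.mean(axis=0) + m
    n_blocks = r.shape[0]
    count = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n_blocks, size=n_blocks)   # blocks, with replacement
        t_star = distance(m, shifted[idx].mean(axis=0))
        count += t_star >= t_obs
    return count / n_boot
```

A small return value (for example below 0.05) would count as evidence against the model; a value near 1 means the model predictions are well within the block-to-block variability.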
MODEL
The normalization model is depicted in Figure 1B. To keep the model mathematically tractable, we
adopt a number of simplifications. To begin, we define the
driving current of a simple cell to be the current that
would be measured by clamping the voltage of the cell at rest. Then we
assume that (1) the relation between the visual stimuli and the driving
current is linear; (2) the cell membrane is a single passive
compartment; (3) the firing rate is a rectified copy of the membrane
potential; (4) cells inhibit each other (possibly through inhibitory
interneurons) by increasing the membrane conductance of each other; and
(5) the pool of cells that inhibit each other contains cells tuned to a
wide variety of stimulus attributes.
The linear stage. As a visual stimulus is projected on the
retina it can be described by its light distribution,
l(x,y,t), which varies in the two spatial dimensions
x,y and in time t. This representation ignores
the color of the stimulus and assumes monocular viewing. The light
distributions of the stimuli used in this study modulated about a fixed
mean $\bar{l}$. In these conditions the output of the retina
is to a first approximation proportional to the local
contrast, $c(x,y,t) = [l(x,y,t) - \bar{l}]/\bar{l}$
(Shapley and Enroth-Cugell, 1984). We will use
the term contrast and the symbol c (without
arguments) to denote the maximal value of the local contrast
c(x,y,t). A uniform field has zero contrast,
whereas a grating modulating between zero and twice its mean intensity has unit contrast.
We consider the driving current in simple cells to be linearly related
to the output of the retina and thus to the local contrast. The driving
current Id(t) is obtained by
weighting the local stimulus contrast
c(x,y,t) at each location and time by the
value of the receptive field W of the cell at that location
and at that time, and by algebraically summing the results:
$$I_d(t) = \iiint W(x, y, t')\, c(x, y, t - t')\, dx\, dy\, dt'. \qquad (1)$$
This linear equation is at best an approximation. Possible
biophysical conditions that would lead to it being exact were suggested
in a previous study (Carandini and Heeger, 1994), and are summarized in
Discussion.
In this study, the driving current Id (and thus
the receptive field W) will be estimated rather than
measured directly. Direct measurement of Id
would require intracellular in vivo voltage-clamp experiments.
RC circuit. We adopt an extremely simplified biophysical
model of a cell membrane: a circuit composed of a resistor and a capacitor arranged in parallel (RC circuit). According to
this model, the membrane potential V(t) obeys the following
equation:
$$C \frac{dV(t)}{dt} + g(t)\, V(t) = I_d(t), \qquad (2)$$
where C is the membrane capacitance, g(t) is
the total membrane conductance, and
$I_d(t)$ is the driving current. In the
absence of visual stimuli the driving current is zero, and the membrane potential is driven to its resting value, which we have taken to be
zero.
Rectification. As a first approximation, the transformation
from the membrane potential V to the spike rate R
can be modeled by rectification (Movshon et al., 1978b;
Jagadeesh et al., 1992; Carandini et al., 1996). Rectification is a
function that is zero for membrane potentials below a threshold,
$V_{thresh}$, and grows linearly above it: $R(t) \propto \max(0, V(t) - V_{thresh})$.
This function is depicted for three different values of the threshold
$V_{thresh}$ by the straight lines in
Figure 2A.
Fig. 2.
Interrelations and effects of the principal
variables in the normalization model. A, Relation between
membrane potential V and firing rate R. For
simplicity in this study the resting potential is taken to be
V = 0. The thick, intermediate, and
thin lines depict rectification with thresholds
Vthresh = 0, 6, and 12 mV, respectively. The
dashed curves indicate approximations to rectification obtained with power functions, with exponents n = 2 (thick dashes) and n = 3 (thin
dashes). B, Relation between pool activity and membrane
conductance. The abscissa plots the overall response of the
pool, $k \sum R$; the ordinate plots the increase in
membrane conductance $g/g_0 - 1$ (Eq. 4).
C, Effects of conductance on the size and time course of the
membrane potential responses. The curves are the membrane potential
responses to a current step with onset at time zero, for three
different values of the conductance g. As the conductance
doubles (thin to thick curves), it reduces both
the gain and the time constant of the cell.
Rectification is however not very easily handled in mathematical
derivations. We thus approximate rectification
(Vthresh > 0) with
half-rectification (Vthresh = 0)
followed by elevation to the power n:
$$R(t) \propto \max(0, V(t))^n. \qquad (3)$$
The quality of this approximation is shown by the dashed
curves in Figure 2A. The value of the exponent
n grows with the distance of the threshold
Vthresh from the resting potential
$V_{rest}$. If the threshold is very close to rest,
then $n \approx 1$ ("half-rectification"). If the
threshold is a bit above rest, e.g., 6 mV higher, then $n \approx 2$ ("half-squaring"). If the threshold is far
above rest, then $n \approx 3$ or more.
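The two nonlinearities being compared can be written side by side. This is only a sketch of the functional forms: the scale factor that would align the power function with thresholded rectification is left out, and the function names and sample potentials are illustrative.

```python
import numpy as np

def rectify(v, v_thresh=0.0):
    """Thresholded rectification: zero below v_thresh, linear above it."""
    return np.maximum(0.0, np.asarray(v, dtype=float) - v_thresh)

def half_rectify_power(v, n=1.0):
    """Half-rectification (threshold at rest, V = 0) raised to the power n."""
    return np.maximum(0.0, np.asarray(v, dtype=float)) ** n

v = np.array([-5.0, 0.0, 6.0, 12.0])      # membrane potential in mV, rest at 0
thresholded = rectify(v, v_thresh=6.0)    # [0, 0, 0, 6]
half_squared = half_rectify_power(v, 2)   # [0, 0, 36, 144]
```

Both functions are zero at and below rest and accelerate slowly just above it, which is why a power function with a suitable exponent and scale can stand in for a threshold placed a few millivolts above rest.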
Conductance and cortical activity. We now make the central
assumption that cells belong to a normalization pool, the
members of which inhibit each other by increasing the conductance
g of each other. This form of inhibition is known as
shunting inhibition and unless all the neurons in the pool
are inhibitory would require the presence of inhibitory
interneurons.
The particular function that we choose to relate the conductance
g and the overall activity of the pool $\sum R$ is
illustrated in Figure 2B. Its mathematical expression
is
$$g = \frac{g_0}{\sqrt{1 - k \sum R}}, \qquad (4)$$
where the parameter k determines the effectiveness of
the normalization pool. This function is completely ad hoc
and is not currently supported by physiological evidence. Our reasons
for choosing it are evident in Appendix, in which we derive closed form equations for the responses of the model.
The membrane conductance g affects both the size and the
time course of the responses. Figure 2C shows the responses
of the membrane to a current step for three values of the conductance g. If the conductance is very small, the response is slow,
and there is high gain (that is, the voltage response to a given
current is high). If the conductance g is very large (the
membrane is very leaky), it has small gain and is fast in charging and
discharging the capacitor.
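The effect of conductance on gain and time course can be illustrated with the closed-form step response of a parallel RC circuit. The sketch assumes a constant conductance g, a membrane that starts at rest (V = 0), and the standard RC equation C dV/dt + g V = I; the parameter values are arbitrary and not taken from the study.

```python
import numpy as np

def step_response(t, current, g, capacitance):
    """Membrane potential after a current step at t = 0, starting from V = 0.

    Steady-state gain is 1/g and the time constant is capacitance/g,
    so a larger conductance lowers both.
    """
    tau = capacitance / g
    return (current / g) * (1.0 - np.exp(-np.asarray(t, dtype=float) / tau))

t = np.linspace(0.0, 0.2, 201)            # seconds
for g in (1.0, 2.0, 4.0):                 # each step doubles the conductance
    v = step_response(t, current=1.0, g=g, capacitance=0.02)
    print(f"g = {g}: final V = {v[-1]:.3f}, time constant = {0.02 / g:.3f} s")
```

Each doubling of g halves both the plateau of the response and the time needed to reach it.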
The conductance of each cell is minimal in the absence of any visual
stimulus, because all of the cells in the normalization pool are
silent. The conductances are larger for a visual stimulus that is
effective in driving the cells in the pool. This decreases the gain and
the time constant of the cells in the pool so that they are less
responsive but better able to follow the fine temporal changes of the
stimulus.
The normalization pool. Our final assumption regards the
composition of the normalization pool. We assume that the cells in the
pool are tuned to all stimulus orientations and directions and to a
broad range of spatial and temporal frequencies.
Solution of the model. The variables in the model depend on
each other in a circular way: (1) the firing rate R of each
cell depends on its membrane potential V (Eq. 3; Fig. 2A); (2) the membrane potential V of each
cell depends on its driving current $I_d$ and on
its conductance g (Eq. 2); and (3) the conductance
g of each cell depends on $\sum R$, the total firing
rate of the cells in the normalization pool (Eq. 4; Fig. 2B). This arrangement results in negative feedback,
because increases in the overall response $\sum R$ increase the
conductance g, which in turn reduces the overall response
$\sum R$. This guarantees that the conductance g
remains finite ($\sum R < 1/k$ in Eq. 4).
The model is a nonlinear neural network (Grossberg, 1988) and is in
general quite complicated, because both the driving current and the
conductance vary over time. Nevertheless, the model was designed so
that for the visual stimuli used in this study (drifting sine gratings,
plaids, and noise) we can derive approximate closed form equations for
its responses. These equations, together with their derivation, are
detailed in Appendix.
RESULTS
We report here on 149 data sets obtained from a total of 54 cells
that were clearly identified as simple and were held long enough to be
tested with at least two blocks of one of the core experiments in our
protocol. In particular, we report on 51 grating matrix experiments
from 34 cells, 76 plaid experiments from 27 cells, and 22 noise-masking
experiments from 17 cells.
The cells in the sample exhibited a broad spectrum of tuning
properties. The orientation tuning of the cells ranged from 14° to
124° half-width, with one-third of the cells showing a tuning sharper
than 24° and one-third broader than 51°. The directional index of
the cells (DI; Reid et al., 1987) ranged over the whole spectrum from 0 to 1. Direction selectivity was prominent (DI > 0.6) in about
one-third of the cells.
Responses to gratings
Figure 3A shows the
period histograms of the responses of a typical simple cell to drifting
sinusoidal gratings with four different stimulus contrasts. Consistent
with the linear model, the responses look like rectified sinusoids.
Fig. 3.
Responses to drifting sine gratings of different
contrasts. The curves are fits of the normalization model.
The fits were performed on a larger data set, which included the
responses to 72 different drifting gratings (8 contrasts, 3 orientations, and 3 temporal frequencies). A, Period
histograms of the responses to four different contrasts. Scale bar in
spikes per second. B, C, Response amplitude and phase as a
function of contrast, computed from the first harmonic of the spike
trains. D, Polar plot of the responses in B and
C. Every point in the plot corresponds to a
sinusoid with an amplitude that is given by the distance from the
origin, and the phase of which is given by the angle with the
horizontal axis. As the contrast increases the responses get larger
(far from the origin), and their phases advance (they turn
counterclockwise). Asterisks indicate the predictions of the
normalization model at the different stimulus contrasts.
Circles have radius 1 SEM (N = 3) computed
from the estimated variance. Error bars in B and
C are ±1 SEM, computed from circles in
D. Cell 392l008 [directional index (DI) = 0.1; preferred
spatial frequency (SF) = 0.9 cycles/°, stimulus size (SZ) = 4.5°],
experiment 4. Parameters: $\tau_0$ = 37 msec; $\tau_1$ = 9 msec; n = 1.34.
Dependence on contrast
There are subtle aspects of the responses that are not consistent
with a strictly linear model. One is response saturation (Maffei and Fiorentini, 1973; Dean, 1981a; Albrecht and Hamilton, 1982;
Ohzawa et al., 1982; Li and Creutzfeldt, 1984; Sclar et al., 1990;
Bonds, 1991; Carandini and Heeger, 1994). For a linear neuron, scaling
stimulus contrast by a certain amount would scale the responses by the
same amount. The responses of the cell in Figure 3, instead, increase
only marginally as the contrast doubles from 0.5 to 1. Another
nonlinearity is reflected in the latency of the responses. For a linear
cell response latency would be unaffected by stimulus contrast. Simple
cells, instead, display phase advance (Dean and Tolhurst,
1986; Carandini and Heeger, 1994; Albrecht, 1995); i.e., they respond
sooner to high-contrast stimuli than to low-contrast stimuli. For
example, the cell in Figure 3 responds ~20 msec sooner to the
stimulus with unit contrast than to the stimulus with 0.12 contrast.
These effects on response size and latency are reflected in the
amplitude and phase of the first harmonic of the responses (Fig.
3B,C). For contrasts <0.2 the amplitudes (Fig.
3B) grow roughly linearly with contrast (the slope in double
logarithmic coordinates is close to 1), and the phases (Fig.
3C) stay substantially constant. As the contrast increases,
the amplitudes saturate and the phases advance.
Figure 3D replots the data in the polar plane where response
amplitude is represented as distance from the origin, and response phase is represented as the angle with the horizontal axis. As the
contrast increases the data points get farther from the origin (response amplitude increases), and they turn counterclockwise (response phase advances).
The predictions of the normalization model are characterized by two
equations, one for response amplitude and one for response phase. The
best fit model parameters were determined by simultaneously fitting
both the amplitude and phase of the responses. The model captures the
saturation in response amplitude (Fig. 3B) because it
postulates that increasing contrast increases the activity of the
normalization pool, which increases the membrane conductance, and thus
decreases the gain of the membrane. The model captures the advance in
response phase, because the increase in membrane conductance decreases
the time constant, so at high contrasts the membrane introduces shorter
delays than at low contrasts. The fits provided by the normalization
model are substantially more accurate than those provided by the linear
model; according to the linear model the data in Figure 3B
should lie on a diagonal line (no amplitude saturation), and the data
in Figure 3C should lie on a horizontal line (no phase
advance).
The equations for response amplitude and phase predicted by the model
are derived in Appendix. We present here the equation for response
amplitude, because it helps further illustrate the behavior of the
model. According to the model, the amplitude of the responses
R of a simple cell to a grating of contrast c and temporal frequency f is:
\[
\mathrm{amplitude}(R) = \left[ \frac{\mathrm{amplitude}(L)\, c}{\sqrt{\sigma(f)^2 + c^2}} \right]^n \qquad (5)
\]
where the quantities L, σ(f), and
n are determined, respectively, by the linear,
normalization, and rectification stages of the model (Fig.
1B). L is the response of the linear
receptive field of the cell to the grating at unit contrast (Eq. 1).
The normalization stage divides this quantity by √(σ(f)² + c²), where
σ(f) grows with the temporal frequency
f of the stimuli. Finally, n is the exponent of
the rectification stage (Eq. 3; Fig. 2A).
The dependence of response amplitude on stimulus contrast is quite
simple; at low contrasts, c ≪ σ(f), the
denominator is approximately constant, and the responses grow as
c^n. At high contrasts, instead, the
c in the denominator has a strong effect, and the responses
saturate. Equation 5 is similar to a hyperbolic ratio, which was
empirically found to provide good fits to the amplitude of the contrast
responses of V1 cells (Albrecht and Hamilton, 1982; Sclar et al., 1990).
Indeed, our ad hoc choice of the dependence of
conductance on the activity of the normalization pool (Eq. 4) was made
with this expression in mind.
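The low-contrast and high-contrast regimes described above can be checked numerically. The following Python sketch assumes that Equation 5 has the form [amplitude(L) c / √(σ(f)² + c²)]^n, as the surrounding text describes. The function names and the parameter values (L_AMP, SIGMA, N) are hypothetical, chosen only for illustration and not taken from any fitted cell.

```python
import math

def contrast_response(c, l_amp, sigma, n):
    """Response amplitude under the assumed form of Eq. 5."""
    return (l_amp * c / math.sqrt(sigma**2 + c**2)) ** n

# Hypothetical parameter values, for illustration only.
L_AMP, SIGMA, N = 10.0, 0.2, 2.0

def log_log_slope(c, factor=1.01):
    """Finite-difference slope of the contrast response in double
    logarithmic coordinates, evaluated near contrast c."""
    r_lo = contrast_response(c, L_AMP, SIGMA, N)
    r_hi = contrast_response(c * factor, L_AMP, SIGMA, N)
    return math.log(r_hi / r_lo) / math.log(factor)

low_slope = log_log_slope(0.001)   # c much smaller than sigma
high_slope = log_log_slope(1.0)    # c much larger than sigma
```

Under this form the slope in double logarithmic coordinates is n σ²/(σ² + c²), so it is close to n when c ≪ σ and approaches 0 when c ≫ σ, which is the saturation described in the text.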
Different orientations
Figure 4 shows the contrast
responses of a simple cell to two drifting gratings differing in their
orientation. As shown in Figure 4A, the responses
elicited by the grating drifting at
15° (left column)
were ~40% larger than those elicited by the grating drifting at
45°. This proportion remained substantially constant in the face of
prominent saturation above a contrast of 0.25.
Fig. 4.
Responses to drifting sine gratings at two
different orientations,
15° (gray) and
45°
(white). Fits of the normalization model (curves)
were performed on a larger data set than shown, which included 72 stimulus conditions (8 contrasts, 3 orientations, and 3 temporal
frequencies). A, Period histograms. Rows
correspond to different contrasts, columns to different
orientations. B, C, Response amplitude and phase as a
function of contrast. To facilitate comparison in C the
responses to each grating were shifted vertically so that the values
predicted by the model would overlap. D, Polar plot of the
responses in B and C. Cell 392l009 (DI = 0.5; SF = 0.4; SZ = 2.2), experiment 8; N = 3. Parameters:
τ0 = 28 msec; τ1 = 3 msec; n = 1.6.
This property can be observed more precisely in Fig.
4B. The contrast responses obtained at the two
different orientations are vertical shifts of each other on a
logarithmic response scale, implying that the ratio of the responses to
different orientations was constant, irrespective of the stimulus
contrast. Another way to describe this behavior is to say that the
orientation tuning scaled with contrast, a property that has been
repeatedly observed for both orientation tuning and spatial frequency
tuning (Movshon et al., 1978c; Albrecht and Hamilton, 1982; Sclar and Freeman, 1982;
Li and Creutzfeldt, 1984; Skottun et al., 1987).
As with response saturation, phase advance was controlled by the
contrast of the stimulus per se, rather than by the firing rate of the
cell. Even though the absolute phases of the responses to the two
gratings differed by about 180° (Fig. 4D) the
relative timing of the responses (difference in response phase) was
independent of stimulus contrast. This is illustrated in Fig.
4C, where the phases of the responses to each grating were
shifted vertically so that the fits provided by the normalization model
would overlap.
The curves predicted by the normalization model provided good fits to
the data in Figure 4. Because saturation and phase advance depend on
the stimulus contrast, and not on the size of the responses elicited in
a cell, their presence is not simply the result of nonlinearities in
the spike-encoding mechanism or in other attributes of a single cell.
Rather, their presence indicates the existence of a contrast gain
control mechanism in the visual cortex such as that described by the
normalization model.
In fact, the model mandates the orientation invariances in the contrast
responses, both in amplitude and in phase. In the expression for the
response amplitude (Eq. 5), stimulus contrast and stimulus orientation
are separable. The expression can be seen as the product of
two factors, [amplitude(L)]^n and (c/√(σ(f)² + c²))^n. The first
factor depends on L, the response of the linear receptive field of the cell to the grating at unit contrast, so it depends on
orientation but not on contrast. The second factor depends only on the
contrast c and on the temporal frequency f of the grating. For a fixed temporal frequency the shape of the contrast responses is entirely controlled by this second factor, which is
independent of stimulus orientation. A similar argument can be made for
the phase responses predicted by the model: the expression for response
phase (Appendix, Eq. 13) is the sum of two terms, one that depends on
stimulus orientation but not on contrast, and one that depends on
stimulus contrast but not on orientation.
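The separability argument for amplitude can be written out by taking logarithms. This short derivation assumes that Equation 5 has the form [amplitude(L) c / √(σ(f)² + c²)]^n described in the text; the labels a and b for two orientations are introduced here only for illustration.

```latex
\[
\log \mathrm{amplitude}(R)
  = n \log \mathrm{amplitude}(L)
  + n \log \frac{c}{\sqrt{\sigma(f)^2 + c^2}}
\]
\[
\frac{\mathrm{amplitude}(R_a)}{\mathrm{amplitude}(R_b)}
  = \left[ \frac{\mathrm{amplitude}(L_a)}{\mathrm{amplitude}(L_b)} \right]^n
  \quad \text{for every contrast } c \text{ at fixed } f
\]
```

The first term of the sum depends on orientation only, so it shifts the whole contrast response vertically on a logarithmic scale; the second term fixes the shape of the curve and is the same for all orientations.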
Different spatial frequencies
Changing the spatial frequency of a grating had the same effect on
the contrast responses as changing orientation; response amplitude was
shifted vertically on a logarithmic scale, and response phase was
shifted vertically on a linear scale. Figure
5 shows an example in which the responses
elicited by the 1.4 cycles/degree grating (Fig. 5A, left
column) were ~70% larger than those elicited by the 1.1 cycles/degree grating (right column). This proportion held
substantially constant in the face of response saturation. The fits of
the normalization model (continuous curves) capture all
these properties of the responses. Indeed, the very same argument about
separability in the model responses of contrast and orientation can be
made for contrast and spatial frequency.
Fig. 5.
Contrast responses for gratings with two different
spatial frequencies: 1.4 (gray) and 1.1 (white) cycles/degree. Fits of the normalization model
(curves) were performed on a larger data set than shown,
which included 40 stimulus conditions (10 contrasts, 2 spatial
frequencies, and 2 temporal frequencies). Contrasts <0.12 elicited <1
spike/sec. A, Period histograms. Rows correspond to different contrasts, columns to different spatial
frequencies. B, C, Response amplitude and phase as a
function of contrast. Responses to each grating in C were
shifted vertically so that their values predicted by the model would
overlap. D, Polar plot of the responses in B and
C. Cell 382l019 (DI = 0.8; SF = 1.4; SZ = 1.9), experiment 5; N = 6. Parameters:
τ0 = 18 msec; τ1 = 8 msec; n = 4.
Different temporal frequencies
Changes in the stimulus temporal frequency had very different
effects from changes in orientation or spatial frequency. In particular
the above-mentioned invariances of the contrast responses did not hold
for stimuli differing in temporal frequency. Rather, we found that
increasing the temporal frequency increased the contrast at which the
responses saturated and decreased the total phase advance. Similar
results (for the amplitude of the responses) were obtained in the cat
by Holub and Morton-Gibson (1981) and in the monkey by Hawken and
collaborators (1992; also see Albrecht, 1995, Appendix).
Figure 6 illustrates these phenomena. At
low temporal frequencies the responses saturated at low contrasts (Fig.
6A, left columns), but at high temporal frequencies
they did not show much saturation (right columns). This
behavior can be better observed in an amplitude plot (Fig.
6B); the contrast responses differ in their
horizontal position, so they could not be superimposed by a vertical
shift, as was the case with the contrast responses to different
orientations or spatial frequencies.
Fig. 6.
Dependence of the contrast responses on temporal
frequency. Curves are predictions of normalization model.
A, Period histograms. Rows correspond to
different contrasts, columns to different temporal frequencies. B, Response amplitude as a function of
contrast. The 3.3 Hz data were very close to the 1.6 Hz data and were
omitted to avoid clutter. C, Response phase as a function of
contrast. Gray levels indicate the temporal frequency as in
A. D, Response amplitude as a function of
temporal frequency and contrast. Dashed lines connect actual
data (dots); continuous lines indicate fits of
the model. Fits were performed on a larger data set than shown, which
included 64 stimulus conditions (8 contrasts, 4 temporal frequencies,
and 2 orientations). Cell 382l021 (DI = 0.1; SF = 1.4;
SZ = 7.5), experiment 5; N = 3. Parameters:
τ0 = 66 msec; τ1 = 8 msec; n = 4.
The effect of temporal frequency on the contrast responses can be
rephrased in terms of the effect of contrast on the temporal frequency
tuning. Increasing stimulus contrast increased the responsivity of the
cells to the high temporal frequencies. This phenomenon is most visible
in Figure 6D, which can be seen as a set of temporal frequency curves measured at different contrasts. Although at low
contrasts the cell was essentially low-pass, at high contrasts the cell
was mildly bandpass, with the 6.5 Hz stimulus eliciting 46% stronger
responses than the 1.6 Hz stimulus. From the quality of the fits it is
clear that the normalization model captures this behavior. The linear
model, on the other hand, predicts that increasing the contrast should
just scale the responses, with no effect on the temporal frequency
tuning.
The effect of contrast on the temporal frequency tuning of the
normalization model can be understood by observing the effects of
changing the conductance on the temporal frequency tuning of an RC
circuit (Fig. 7). Increases in
conductance reduce the gain of the membrane more at low frequencies
than at high frequencies, substantially increasing the cutoff frequency
of the membrane. Because the conductance grows with stimulus contrast,
at low contrasts the cutoff frequency of the membrane is low, and the
low-pass character of the membrane dominates the responses. At higher
contrasts the cutoff frequency of the membrane is higher, and the
tuning of the responses is determined by the linear receptive field
providing input to the membrane. In the case of the cell in Fig. 6, the fits of the model indicate that the tuning of the linear receptive field was bandpass.
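This argument can be illustrated with a minimal Python sketch that treats the membrane as a generic RC circuit with transfer function 1/(g + i2πfC). It uses the two time constants listed in the Figure 6 caption (66 and 8 msec), interpreted here as the membrane time constants at rest and at unit contrast. The capacitance is set to an arbitrary value of 1, and the function names are hypothetical.

```python
import math

def rc_gain(f_hz, tau_s, capacitance=1.0):
    """Gain of an RC circuit, |1 / (g + i*2*pi*f*C)|, with g = C / tau."""
    g = capacitance / tau_s
    return 1.0 / math.hypot(g, 2 * math.pi * f_hz * capacitance)

def cutoff_hz(tau_s):
    """Cutoff frequency of the RC circuit, 1 / (2*pi*tau)."""
    return 1.0 / (2 * math.pi * tau_s)

# Time constants from the Figure 6 caption, in seconds (assumed to be
# the membrane time constants at rest and at unit contrast).
TAU_REST, TAU_UNIT = 0.066, 0.008

# Gain at unit contrast relative to gain at rest, per temporal frequency.
gain_ratio = {
    f: rc_gain(f, TAU_UNIT) / rc_gain(f, TAU_REST)
    for f in (1.6, 3.3, 6.5, 13.0)
}
```

With these values the gain at 1.6 Hz falls to roughly 15% of its resting value, whereas the gain at 13 Hz falls only to roughly 56%, and the cutoff frequency rises from about 2.4 Hz to about 20 Hz. The ratio does not depend on the arbitrary capacitance, because g scales with C.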
Fig. 7.
Effects of changing the conductance
g = 1/R in an RC circuit. Circuit parameters, and their
dependence on contrast, are estimated from the experiment in Figure 6.
Continuous curves show the transfer function at rest (low
conductance); dashed curves show the transfer function at
unit contrast (high conductance). Arrows indicate decrease
in gain (top) and phase advance (bottom) at four
temporal frequencies (1.6, 3.3, 6.5, and 13 Hz).
Figure 7 also illustrates an example of how phase advances in an RC
circuit with increased conductance. The vertical arrows in
the bottom panel of Figure 7 indicate the total phase
advance predicted by the model at the four temporal frequencies tested in the experiment of Figure 6. The best fit model parameters predict that phase advance between zero and unit contrast is largest for the
6.5 Hz stimulus (51.9°), marginally smaller for the 3.3 and 13 Hz
stimuli (44.4° and 46.9°), and smaller still for the 1.6 Hz
stimulus (29.5°). The expression for the total phase advance predicted by the model is:
\[
\mathrm{phase\ advance} = \arctan(2\pi f \tau_0) - \arctan(2\pi f \tau_1) \qquad (6)
\]
where f is the stimulus temporal frequency, and
τ0 and τ1 are, respectively, the time
constant of the membrane at 0 and at unit contrast. The maximal phase
advance is achieved at a frequency equal to 1/(2π√(τ0τ1)).
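The phase advances quoted above can be approximately reproduced with a few lines of Python. The sketch assumes that the total phase advance is the difference between the phase lags of an RC circuit at the two time constants, arctan(2πfτ0) − arctan(2πfτ1), and uses the rounded parameters from the Figure 6 caption (τ0 = 66 msec, τ1 = 8 msec). The function names are hypothetical.

```python
import math

def phase_advance_deg(f_hz, tau0_s, tau1_s):
    """Phase advance (degrees) between zero and unit contrast for an RC
    circuit whose time constant drops from tau0 to tau1."""
    lag_rest = math.atan(2 * math.pi * f_hz * tau0_s)
    lag_unit = math.atan(2 * math.pi * f_hz * tau1_s)
    return math.degrees(lag_rest - lag_unit)

def peak_frequency_hz(tau0_s, tau1_s):
    """Temporal frequency at which the phase advance is largest."""
    return 1.0 / (2 * math.pi * math.sqrt(tau0_s * tau1_s))

# Rounded parameters from the Figure 6 caption, in seconds.
TAU0, TAU1 = 0.066, 0.008

advances = {f: phase_advance_deg(f, TAU0, TAU1) for f in (1.6, 3.3, 6.5, 13.0)}
peak = peak_frequency_hz(TAU0, TAU1)
```

Because the caption reports rounded time constants, the computed advances agree with the quoted values to within about a degree rather than exactly. The peak frequency comes out just under 7 Hz, consistent with the 6.5 Hz stimulus showing the largest advance among the four tested.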
The data in Figure 8 exemplify the
dependence of phase advance on temporal frequency. For this cell the
best fit model parameters predict that the phase advance should be
minimal (11.3°) at 1.6 Hz and increase with temporal frequency:
20.77° at 3.3 Hz, 31.8° at 6.5 Hz, and 35.7° at 13 Hz. The data
clearly confirm this trend, which was typical of our sample. Indeed,
most of the figures in this study display data acquired with temporal
frequencies of ~6 Hz. We wanted to provide examples of contrast
responses showing clear saturation and clear phase advance. As
predicted by the model, we found that temporal frequencies <3 Hz
yielded strong saturation but little phase advance, whereas temporal
frequencies much >6 Hz showed large phase advances but little
saturation.
Fig. 8.
Phase advance and temporal frequency.
Curves are predictions of normalization model. A,
Period histograms. Rows correspond to different contrasts,
columns to different temporal frequencies. B,
Response phase as a function of contrast. Gray levels
indicate the temporal frequency as in A. Fits were performed
on a larger data set than shown, which included 60 stimulus conditions
(5 contrasts, 4 temporal frequencies, and 3 spatial frequencies). Cell
392l008 (same as Fig. 3), experiment 7; N = 3. Parameters:
τ0 = 27 msec; τ1 = 7 msec; n = 1.2.
The increase in phase advance with increasing temporal frequency can
also be seen as a decrease in integration time, the slope of
a line fitted to a phase versus temporal frequency plot of the data. A
similar phenomenon, together with dramatic changes in the temporal
frequency tuning of the cells, was observed in cat by Reid et al. (1992)
using broad-band high-energy stimuli. The authors of that study
pointed out that these behaviors could be explained by changes in the
membrane conductance of cortical cells. The normalization mechanism
that we propose works exactly that way, and indeed we have shown that
it predicts effects similar to those observed by Reid and collaborators
(Carandini and Heeger, 1993).
An entire data set
The curves predicted by the model illustrated in the preceding
figures were the result of fits to entire data sets, not just to the
data appearing in the figures. For example, the responses in Figure 3
were obtained in a grating matrix experiment that included 72 different
drifting gratings, with eight different contrasts, three different
orientations, and three different temporal frequencies. The full set of
responses to these stimuli is shown in Figure
9. This example illustrates the principal
properties of the contrast responses; changing orientation shifts the
amplitude responses vertically on a logarithmic scale and the phase
responses vertically on a linear scale. Amplitude saturation is more
prominent at low temporal frequencies; phase advance is more prominent
at higher temporal frequencies.
Fig. 9.
An entire grating matrix data set. The cell was
tested with three different temporal frequencies (A, 3.3 Hz;
B, 6.6 Hz; C, 13 Hz), three different
orientations (white, 120°; gray, 80°; black, 40°), and nine different contrasts. Some period
histograms for these responses are shown in Figure 3A. The
shapes of the 18 curves are determined by only 3 parameters:
τ0 = 37 msec; τ1 = 9 msec; n = 1.34. Eighteen additional parameters determine the vertical positions of the eighteen curves. Cell 392l008, experiment 4;
N = 3.
The 18 curves predicted by the normalization model (9 for amplitude and
9 for phase) provide satisfactory fits to the data. Whereas the
vertical position of each curve depends on the linear stage of the
model, the shape of all the curves (including their horizontal
position) depends on the normalization and rectification stages. In
particular, the vertical position of each curve is determined by one
parameter, corresponding to the amplitude or phase of the response of
the linear stage to each grating at full contrast. The shape and
horizontal position of all the curves, instead, are determined by a
total of three parameters. The first two are the time constants
τ0 and τ1 of the membrane at rest and at
full contrast; these characterize the normalization stage and [by
determining σ(f)] control the horizontal
position of the amplitude curves and the steepness of the phase curves.
The third parameter is the exponent n, which characterizes
the rectification stage. It controls the steepness of the amplitude
curves below saturation, and has no effect on the phase curves.
Responses to plaids
We now consider the responses to a wider set of visual stimuli:
plaids composed of two drifting gratings having the same temporal frequency. The gratings differed in orientation and/or in spatial frequency, and their contrasts c1 and
c2 assumed a variety of different values.
Cells in the cat primary visual cortex display a phenomenon known as
"cross-orientation inhibition" (Morrone et al., 1982; Bonds, 1989;
Gizzi et al., 1990), in which the responses to optimal stimuli are
inhibited by the presence of stimuli of nonoptimal orientation, which
would elicit negligible responses if presented alone. More generally,
there are numerous reports of conditions in which cells in the cat
visual cortex are inhibited by stimuli that elicit no response when
presented alone. This inhibition has been found to be independent of
direction of motion, largely independent of orientation, and broadly
tuned for spatial and temporal frequency (Bishop et al., 1973; Dean et al., 1980;
Burr et al., 1981; Hammond and MacKay, 1981; Morrone et al., 1982;
De Valois and Tootell, 1983; Kaji and Kawabata, 1985; Gulyas et al., 1987;
Bonds, 1989; Nelson, 1991; DeAngelis et al., 1992; Geisler and Albrecht, 1992).
Cross-orientation inhibition can be elicited with
one grating in each eye, although suppression with both gratings in the
same eye is typically stronger (Ferster, 1981; Ohzawa and Freeman, 1986a,b;
Freeman et al., 1987; DeAngelis et al., 1992; Sengpiel and Blakemore, 1994;
Sengpiel et al., 1995; Walker et al., 1996).
Our results indicate that cross-orientation inhibition is present in
most cells of the monkey primary visual cortex. An example of this is
shown in Figure 10, which shows the
responses of a simple cell to a plaid with components that drifted in
orthogonal directions. Although one of the gratings (grating 1) was
quite effective in driving the cell (Fig. 10A, left
column), the other (grating 2) elicited almost no spikes when
presented alone (top row). Its presence, however, clearly
suppressed the responses to the first grating. The inhibitory effect of
the second grating can be observed more precisely in Figure
10B, which shows the contrast responses of the cell
for four different contrasts of grating 2. As observed by Bonds (1989)
in the cat, the presence of the second grating shifts the contrast
response to the right on a logarithmic scale. This shift to the right
would not be explained by the linear model; if cross-orientation
inhibition were attributable to a linear interaction between two
(possibly subthreshold) linear responses, it would subtract from the
responses a fixed quantity. The responses to the first grating would
saturate at the same contrast, irrespective of the contrast of the
second grating. As shown in Figure 10, this is not the case.
Fig. 10.
Masking by an orthogonal grating. Responses to a
plaid experiment in which one component was nearly optimally oriented
(grating 1), and the other was orthogonal and ineffective in driving
the cell when presented alone (grating 2). Curves are fits
of the normalization model. A, Period histograms for
different contrasts of the components. Rows, Different
contrasts of grating 1 (c1). Columns, Different contrasts of grating 2 (c2). As c2 was
increased, the responses decreased in size (cross-orientation
inhibition). B, Response amplitude as a function of
c1, for different values of
c2 (white to black: 0.06, 0.12, 0.25, and 0.5). As c2 increased, the
contrast responses shifted to the right; more and more contrast of
grating 1 was needed to maintain a set level of firing. C, Same data, plotted as a function of c2, for
different values of c1 (white to
black: 0, 0.06, 0.25, and 0.5). Cell 392l024 (DI = 0.4;
SF = 0.1; SZ = 6.8), experiment 9; N = 3. Parameters:
τ0 = 158 msec; τ1 = 5 msec; n = 2.3.
The shift to the right of the contrast responses corresponds to an
effective scaling of stimulus contrast. This is the behavior predicted
by the normalization model (Heeger, 1992b), which, as illustrated by
the curves in Figure 10, provided good fits to our plaid
data. Approximate equations for the amplitude and phase of the
responses of the model to plaids are derived in Appendix. The
expression for response amplitude is:
\[
\mathrm{amplitude}(R) \approx \left[ \frac{\mathrm{amplitude}\left( c_1 L_1(t) + c_2 L_2(t) \right)}{\sqrt{\sigma(f)^2 + c_1^2 + c_2^2}} \right]^n \qquad (7)
\]
where c1 and c2 are
the contrasts of the two gratings, L1(t) and
L2(t) are the responses of the linear receptive
field to the individual gratings at unit contrast, and the remaining
symbols have the same meaning as in the expression for the response to individual gratings (Eq. 5). Since the receptive field of the cell is
linear, its response to the plaid is just a linear combination of its
responses to the individual gratings,
c1L1(t) + c2L2(t). The normalization stage divides
that by approximately √(σ(f)² + c1² + c2²) (see Appendix). If, as in
Figure 10, grating 2 alone does not elicit any response
(L2 ≈ 0), then the effect of an increase of
c2 in the denominator is to shift the contrast
response to the right on the log contrast axis (Heeger, 1992b).
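The rightward shift can be made concrete with a small Python sketch. It assumes that Equation 7 has the form [amplitude(c1L1 + c2L2) / √(σ(f)² + c1² + c2²)]^n, as described in the text, and represents each linear response by a complex number encoding its amplitude and phase. All names and numerical values are hypothetical and serve only as illustration.

```python
import math

def plaid_amplitude(c1, c2, l1, l2, sigma, n):
    """Assumed form of Eq. 7. l1 and l2 are complex numbers giving the
    amplitude and phase of the linear responses at unit contrast."""
    linear = abs(c1 * l1 + c2 * l2)
    return (linear / math.sqrt(sigma**2 + c1**2 + c2**2)) ** n

# Hypothetical values: grating 2 evokes no linear response on its own.
L1, L2 = 10.0 + 0j, 0j
SIGMA, N = 0.1, 2.0
C1, C2 = 0.1, 0.25

# With L2 = 0 the mask acts as if sigma were replaced by sqrt(sigma^2 + c2^2),
# i.e. as if the contrast axis were rescaled by the factor k.
k = math.sqrt(SIGMA**2 + C2**2) / SIGMA

unmasked = plaid_amplitude(C1, 0.0, L1, L2, SIGMA, N)
masked_same_c1 = plaid_amplitude(C1, C2, L1, L2, SIGMA, N)
masked_scaled_c1 = plaid_amplitude(C1 * k, C2, L1, L2, SIGMA, N)
```

In this sketch the masked response at contrast k·c1 equals the unmasked response at contrast c1, so on a logarithmic contrast axis the whole curve moves to the right by log k without changing shape.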
The pure rightward shift of the contrast responses occurs only when the
cell is completely unresponsive to the masking grating. When each
grating in the plaid elicits (even minimal) responses when presented
alone, their combined effect is more complicated. In this case the
sinusoidal responses of the linear receptive field to the individual
gratings are added together before the normalization stage. Depending
on their relative phase they can add constructively or destructively.
An example of this is shown in Figure
11. The top and bottom
rows in Figure 11A show the period histograms of
the responses of a cell to two gratings of different spatial frequency.
Both gratings elicited strong responses, with phases differing by
approximately 90°. The responses to the "plaids" obtained by
summing the gratings are shown in the middle row.
Fig. 11.
Responses to the sum of two equally effective
stimuli. Components differed in spatial frequency. Continuous
curves are fits of the normalization model. A, Period
histograms for four different contrasts. Dark gray,
Responses to grating 1 (1.2 cycles/degree). White, Responses
to grating 2 (0.6 cycles/degree). Light gray, Responses to
the sums of the stimuli in the top and bottom
rows. B, Polar plot of the first harmonic responses.
Squares indicate the vectorial sums of the responses to the
individual gratings (dark gray and white data
points). The actual responses to the plaid were smaller (closer to
the origin) and occurred slightly sooner (counterclockwise) than this
linear prediction. Cell 385r037 (DI = 0.9; SF = 0.8; SZ = 1.7), experiment 05; N = 4. Parameters:
τ0 = 45 msec; τ1 = 2 msec; n = 2.44.
The sum of sinusoids is best understood in a polar plot (Fig.
11B), in which every sinusoid corresponds to a
vector, and the sum of sinusoids is just a sum of vectors. The
dark gray data points are the responses to grating 1; the
white data points are the responses to grating 2. The
light gray data points are the responses to the plaid
obtained by superimposing the two gratings. The squares
indicate the linear predictions for the plaid responses obtained by
summing (vectorially) the responses to the individual gratings. The
actual plaid responses show more saturation (they remain closer to the
origin) than these linear predictions. They also occur earlier (their
angle with the horizontal axis is larger) than the linear predictions.
Although not perfect, the fits of the normalization model (continuous
curves) capture both phenomena. This is because the local stimulus
energy of the plaid is greater than that of the individual gratings. In
the model this results in higher membrane conductance, which causes a
decrease in gain and time constant.
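The linear prediction marked by the squares amounts to adding two vectors. A minimal Python sketch of that vector sum follows, representing each first-harmonic response as a complex number whose modulus is the amplitude and whose argument is the phase. The amplitudes and phases used are hypothetical round numbers, not values measured from the cell in Figure 11.

```python
import cmath
import math

def phasor(amplitude, phase_deg):
    """Complex number for a sinusoid with the given amplitude and phase."""
    return cmath.rect(amplitude, math.radians(phase_deg))

def linear_prediction(r1, r2):
    """Amplitude and phase (degrees) of the sum of two sinusoidal responses."""
    total = r1 + r2
    return abs(total), math.degrees(cmath.phase(total))

# Two equally strong responses whose phases differ by 90 degrees.
amp_quadrature, phase_quadrature = linear_prediction(phasor(10, 0), phasor(10, 90))

# Two equally strong responses in opposite phase cancel.
amp_opposite, _ = linear_prediction(phasor(10, 0), phasor(10, 180))
```

For two equal responses 90° apart the summed amplitude is √2 times that of either component and the phase lies halfway between them; responses in opposite phase cancel. The measured plaid responses fall short of such linear sums in amplitude, which is the extra saturation the normalization stage is meant to capture.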
Figure 12 illustrates another
example of plaid responses. In this case two orthogonal gratings were
able to drive the cell. Grating 2 was not as effective as grating 1, but it did elicit some spikes when presented alone. The dependence of
the responses on the contrasts of the gratings is complicated:
depending on the contrast of grating 1, increasing the contrast of
grating 2 either enhanced or suppressed the responses. This behavior
would be hard to explain at the level of a single cell. Instead, as shown by the continuous curves fit to the responses, it is precisely predicted by the normalization model. The contrasts of the two gratings, c1 and c2,
appear both in the numerator and in the denominator of Equation 7.
Increasing one of the two can result either in an enhancement or in a
reduction in the response, depending on the amplitudes and phases of
the underlying linear responses L1 and
L2.
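A short Python sketch shows how the same expression can produce both enhancement and suppression. It assumes that Equation 7 has the form [amplitude(c1L1 + c2L2) / √(σ(f)² + c1² + c2²)]^n. The linear responses (grating 2 one fifth as effective as grating 1 and in the same phase), the value of σ, and the exponent are hypothetical choices for illustration, not the fitted parameters of the cell in Figure 12.

```python
import math

def plaid_amplitude(c1, c2, l1, l2, sigma, n):
    """Assumed form of Eq. 7; l1 and l2 are complex linear responses."""
    linear = abs(c1 * l1 + c2 * l2)
    return (linear / math.sqrt(sigma**2 + c1**2 + c2**2)) ** n

# Hypothetical parameters: grating 2 drives the cell weakly.
L1, L2 = 1.0 + 0j, 0.2 + 0j
SIGMA, N = 0.1, 2.0

def response(c1, c2):
    return plaid_amplitude(c1, c2, L1, L2, SIGMA, N)

# Grating 1 absent: raising c2 increases the response.
change_alone = response(0.0, 0.5) - response(0.0, 0.06)

# Grating 1 at intermediate contrast: raising c2 decreases the response.
change_masked = response(0.12, 0.5) - response(0.12, 0.0)
```

With grating 1 absent, c2 contributes to the numerator faster than it inflates the denominator, so the response grows. With grating 1 at an intermediate contrast, the added stimulus energy in the denominator outweighs the small linear contribution of grating 2, and the response drops.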
Fig. 12.
Masking with a grating that is effective in
driving the cell. Responses to a plaid experiment in which one
component was nearly optimally oriented (grating 1), and the
other was orthogonal but still elicited some response when presented
alone (grating 2). A, Period histograms for
different contrasts of the components. Rows, Different
contrasts of grating 1 (c1).
Columns, Different contrasts of grating 2 (c2). When presented alone, grating 1 elicited strong responses (left column), grating 2 weak
responses (top row). B, Response amplitude as a
function of c2, for different values of
c1 (white to black: 0, 0.06, 0.12, and 0.5). Increasing c2 increased
the size of the responses when grating 1 was absent; it inhibited the
responses for intermediate contrasts of grating 1, and it had little
effect for high contrasts of grating 1. Cell 392r013 (DI = 0.9;
SF = 0.4; SZ = 3.4), experiment 12; N = 3. Parameters:
τ0 = 136 msec; τ1 = 1.4 msec; n = 2.22.
Figure 13 illustrates the responses of
the same cell to different plaids. The top panel in Figure
13A replots the amplitude data of Figure 12, and the
bottom panel shows the corresponding phase data,
illustrating that increasing the contrast of either grating resulted in
phase advance. In Figure 13A grating 2 drifted at 90° with
respect to grating 1, and it elicited responses that were smaller by
about a factor of five. When grating 2 was replaced by one drifting at
30° with respect to grating 1, it elicited responses that were only
marginally smaller than those to grating 1 (Fig. 13B, top
panel). The phases of the responses to the two individual
gratings were almost opposite (Fig. 13B, bottom
panel), ~0° for grating 1 and ~135° for grating 2. As a result the two stimuli interacted destructively, as witnessed by
the dip in the diagonal region of the top panel in Figure
13B. In that region increasing the contrast of either of the
two gratings reduced the amplitude of the responses. The model clearly
captures this phenomenon, which is principally attributable to its
linear stage. When the spatial phase of grating 2 was changed by 90°
(Fig. 13C), this phenomenon disappeared. Now increasing the
contrast of either grating increased the size of the responses.
Fig. 13.
Amplitude (top) and phase
(bottom) of the responses of a cell to