The Journal of Neuroscience, April 1, 2002, 22(7):2904-2915
Natural Stimulation of the Nonclassical Receptive Field Increases
Information Transmission Efficiency in V1
William E. Vinje1,2 and Jack L. Gallant1,3
1 Neuroscience Program and Departments of 2 Molecular and Cellular Biology and 3 Psychology, University of California at Berkeley, Berkeley, California 94720-1650
ABSTRACT
We have investigated how the nonclassical receptive field (nCRF)
affects information transmission by V1 neurons during simulated natural
vision in awake, behaving macaques. Stimuli were centered over the
classical receptive field (CRF) and stimulus size was varied from one
to four times the diameter of the CRF. Stimulus movies reproduced the
spatial and temporal stimulus dynamics of natural vision while
maintaining constant CRF stimulation across all sizes. In individual
neurons, stimulation of the nCRF significantly increases the
information rate, the information per spike, and the efficiency of
information transmission. Furthermore, the population averages of these
quantities also increase significantly with nCRF stimulation. These
data demonstrate that the nCRF increases the sparseness of the stimulus
representation in V1, suggesting that the nCRF tunes V1 neurons to
match the highly informative components of the natural world.
Key words:
information theory; nonclassical receptive field; V1; sparse coding; efficiency; natural vision
INTRODUCTION
The classical receptive field (CRF) of a visual neuron is traditionally defined as the region of space where stimuli evoke action potentials. Surrounding the CRF is the nonclassical receptive field (nCRF), where stimuli can modulate the responses evoked by CRF stimulation (Allman et al., 1985). The nCRF may serve to mediate contrast gain control through divisive modulation of the responses evoked by CRF stimulation (Heeger, 1992; Wilson and Humanski, 1993). However, several experiments suggest that the nCRF may also be critical for representing extended contours (Gilbert and Wiesel, 1990; Fitzpatrick, 2000), corners (Sillito et al., 1995), or local curvature (Wilson and Richards, 1992; Krieger and Zetzsche, 1996), and may aid in figure-ground segmentation (Knierim and Van Essen, 1992). Together, these results demonstrate that the nCRF plays an important role in the functioning of V1 neurons.
In a previous study, we showed that natural stimulation of the nCRF
increases the selectivity of V1 neurons and decorrelates their
responses (Vinje and Gallant, 2000). Those results
suggested that nCRF stimulation increases the sparseness of stimulus
representation in V1. Sparseness refers to the coding density of a
neural representation. In a maximally dense representation, every
neuron responds to every stimulus and information is fully distributed
across the population. In a maximally sparse representation, each
neuron responds to a single stimulus and acts as a "grandmother
cell." Extremely dense and extremely sparse codes are biologically
implausible; any real neural code will fall somewhere between these two extremes.
In a sparse representation, neurons are narrowly tuned and relatively
few are active at any moment. A central tenet of sparse coding is that
information should be translated without loss into an efficient
representation where the responses of a few active neurons are rich in
information content. Reducing the number of active neurons is
metabolically economical, thus easing a major constraint on information
processing in the brain (Laughlin et al., 1998; Sibson et al., 1998). In addition, the relatively large information content per neuron potentially influences many aspects of brain function, including pattern recognition capability and memory capacity (Barlow, 1961, 2001).
The optimal level of sparseness is a function of the goals of the
system and the resources available. Recent theoretical work suggests
that natural images can be efficiently represented by a sparse code
(Srinivasan et al., 1982; Barlow, 1989; Field, 1993; Bell and Sejnowski, 1997; Olshausen and Field, 1997, 2000; Simoncelli and Olshausen, 2001).
Field (1987) demonstrated that linear filters can produce a highly kurtotic, sparse output distribution in response to natural images. However, some nCRF functions might only be realizable with nonlinear mechanisms [e.g., biologically plausible curvature/corner detectors must be substantially nonlinear (Zetzsche and Barth, 1990; Krieger and Zetzsche, 1996)]. Therefore, nonlinear operations such as those implemented by the nCRF are likely to play an important role in increasing the sparseness of neural coding (Olshausen and Field, 1997).
The hypothesis that nCRF stimulation increases the sparseness of
individual V1 neurons leads to numerous predictions. Several of these
predictions were confirmed in a previous report (Vinje and Gallant, 2000). As sparseness increases, individual neurons become more selective in their responses to complex stimuli, the kurtosis of the firing rate distribution increases, and the responses of neuron pairs are decorrelated.
The hypothesis that nCRF stimulation increases sparseness also leads to
four additional predictions. First, the average response rate should
decrease as sparseness increases in order to reduce the metabolic
demands of visual processing. Second, the reduction in spiking activity
should not reduce the information carried by the population of V1
neurons. Third, the average information content per spike should
increase. Finally, as sparseness increases, individual neurons should
become more efficient at information transmission.
MATERIALS AND METHODS
Subjects and physiological procedures. All animal procedures were approved by oversight committees at Washington University (St. Louis, MO) and the University of California at Berkeley and conformed to or exceeded all relevant National Institutes of Health and United States Department of Agriculture standards. Surgical procedures were conducted under appropriate anesthesia using standard sterile techniques (Connor et al., 1997).
Extracellular, single-neuron recordings were made with epoxy-coated
tungsten electrodes (AM Systems, Everett, WA and FHC, Bowdoinham, ME) from two awake, behaving monkeys (Macaca mulatta). Signals were amplified, band-pass filtered, and isolated
with a hardware window discriminator. Spike triggers were monitored at
8 kHz. Only clearly isolated single units were included in the data set.
Chambers were located over putative V1 by means of external cranial
landmarks. To confirm that recordings were obtained from V1 neurons, we
compared measured receptive field sizes and electrophysiological response properties with those expected from the literature.
Receptive-field estimation. The boundaries of the CRF were
estimated using bars and gratings for which characteristics and placement were manually controlled. We estimated the size of the CRF as
the diameter of the circle that circumscribed the minimum response
field of the neuron. For most neurons, these manual estimates were
confirmed by reverse correlation analysis using a dynamic (72 Hz)
sequence of small white squares flashed randomly in and around the CRF.
Reliable CRF estimates were typically obtained from 100-300 sec of
data, representing 20-60 behavioral fixation trials. In most cases
there was excellent agreement between CRF profiles estimated using the
two methods. In those cases in which the methods disagreed, the reverse
correlation size estimates were used. CRF diameters ranged from ~20
to 50 min of arc, consistent with other studies (Snodderly and Gur, 1995).
Simulated eye-movement model. During natural vision,
primates make stereotyped eye movements consisting of relatively long, stable fixations interspersed with rapid saccades from one point to
another (Keating and Keating, 1982; Burman and Segraves, 1994). The temporal structure of natural visual
stimulation is strongly influenced by these underlying eye movements.
We simulated natural macaque eye movements using a statistical model.
Eye-movement distributions were acquired during free-viewing
experiments using a scleral search coil. These data were used to model
the distribution of saccade lengths and the velocity profiles
appropriate for each saccade. For each simulated eye-movement sequence,
fixation durations were chosen randomly from a Gaussian distribution
with a mean of 350 msec and an SD of 50 msec. Saccade directions were
chosen randomly from a uniform distribution of angles.
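The two random draws described above can be illustrated with a minimal sketch. It covers only the fixation durations (Gaussian, mean 350 msec, SD 50 msec) and the saccade directions (uniform over angles); the empirically derived saccade lengths and velocity profiles are not modeled. The function name `simulate_fixations` and the clipping of durations to a positive floor are assumptions made for illustration, not part of the published model.

```python
import numpy as np

def simulate_fixations(n_fixations, rng, mean_ms=350.0, sd_ms=50.0):
    """Draw fixation durations (msec) and saccade directions (radians).

    Hypothetical helper: only the two distributions named in the text
    are sampled here.
    """
    durations = rng.normal(mean_ms, sd_ms, size=n_fixations)
    # Assumption: clip to a small positive floor so no duration is <= 0.
    durations = np.clip(durations, 1.0, None)
    # Saccade directions are uniform over the full circle.
    directions = rng.uniform(0.0, 2.0 * np.pi, size=n_fixations)
    return durations, directions

rng = np.random.default_rng(0)
durations, directions = simulate_fixations(1000, rng)
```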
Natural-vision movies. Natural-vision movies were
constructed by extracting image patches from natural scenes along the
simulated eye-scan path. Scenes were chosen from a commercial,
high-resolution photo-CD image library of landscapes, structures,
people, and animals (Corel Corp., Ottawa, Ontario, Canada) and were
converted to grayscale before display. Image patches were extracted
along a simulated scan path that was sampled at ~1 kHz. Each 13.8 msec (72 Hz) movie frame was constructed by averaging 14 separate image patches. Individual frames were then concatenated to form movies. This
over-sampling followed by averaging minimized the potential of
introducing temporal aliasing artifacts into the movie.
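The over-sampling and averaging step can be sketched as follows, assuming the patches are stored as an array ordered along the scan path. The function name `average_patches_into_frames` and the handling of leftover patches are illustrative assumptions.

```python
import numpy as np

def average_patches_into_frames(patches, patches_per_frame=14):
    """Average consecutive groups of scan-path patches into movie frames.

    patches: array of shape (n_patches, height, width), sampled along
    the simulated scan path (about 1 kHz in the text).
    """
    patches = np.asarray(patches, dtype=float)
    n_frames = patches.shape[0] // patches_per_frame
    # Assumption: trailing patches that do not fill a whole frame are dropped.
    usable = patches[: n_frames * patches_per_frame]
    grouped = usable.reshape(n_frames, patches_per_frame, *patches.shape[1:])
    # Each frame is the mean of its 14 constituent patches.
    return grouped.mean(axis=1)

demo_patches = np.arange(30, dtype=float).reshape(30, 1, 1)
demo_frames = average_patches_into_frames(demo_patches)
```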
Patches of one, two, three, or four times the diameter of the CRF were
used to create a set of natural-vision movies. Movies of different
sizes were not scaled versions of one another. Instead, the patch
boundary was changed to reveal more or less of the underlying natural
scene. Thus, the region of the natural vision movie covering the CRF
was identical across all movie sizes, and any response modulation
attributable to stimulus size should reflect the effects of nCRF
stimulation. Figure 1 illustrates the
stimulus generation method.

Figure 1.
Natural-vision movies reproduce the stimulation
that occurs during free viewing of natural scenes. To construct a
natural vision movie, a saccadic scan path (white line) is
generated using a model derived from previously recorded eye movements.
Image patches centered on the scan path coordinates (white
circles) are then extracted from the underlying image. Image
patches were from one to four times the size of the CRF. (The
small circle indicates 1 × CRF diameter, whereas the
large circle indicates 4 × CRF diameter.) Note that
although nCRF stimulation varied substantially with stimulus size, the
stimulus falling on the CRF was the same for all sizes.
Flashed natural-image patches. An additional stimulus set
was constructed by extracting image patches from along an eye-scan path
that was recorded during free viewing of natural scenes (Gallant et al., 1998). Eye positions during fixations were identified using an automated procedure that registered a fixation whenever the
eye remained within a 0.3 CRF diameter window for at least 70 msec; a
change in fixation was registered when the eye moved >0.3 receptive
field diameters from its original location. These fixation locations
were used as center points for patch extraction, and patches of 1 × CRF and 3 × CRF diameter were extracted from the natural scene
in the manner described above. Each patch data set contained responses
from 10-25 such patches.
The image patches were presented in grayscale under behavioral
conditions similar to those used for natural vision movies (see below).
Patches were shown at either the same size as the estimated CRF or
three times larger than the CRF. Each behavioral trial included four
random patches flashed for 500 msec each and separated by 700 msec
interstimulus intervals.
Stimulus presentation. Stimulus presentation and behavioral
control were handled by an Indigo2 workstation
(SGI, Mountain View, CA) using custom software. Stimuli were
presented on a high-quality video monitor (Sony Trinitron; Sony, Tokyo,
Japan) at 1280 × 1024 pixel resolution. Movies were broken into 5 sec segments (trials) and were shown centered on the CRF center of the
recorded neuron. During movie display, the animal fixated on a small
target spot near the center of the monitor. Eye position was monitored
using a scleral search coil, and trials were aborted if the eye
deviated from fixation by >0.35°. At the end of each successful
trial, the animal earned a liquid reward. Only one stimulus size was
shown on each trial; stimuli of different sizes were randomly
interleaved across trials.
Response modulation ratio. The nCRF modulation produced by
stimuli of a given diameter is quantified in terms of the response modulation ratio:
$$R_i = \frac{r_i^{m \times \mathrm{CRF}}}{r_i^{\mathrm{CRF}}} \tag{1}$$
In our analysis, the fundamental quantity of interest is the
average number of action potentials occurring in each time bin of the
natural-vision movie. In Equation 1, $r_i^{\mathrm{CRF}}$ is the average response recorded during the ith time bin for stimuli confined to the CRF, and $r_i^{m \times \mathrm{CRF}}$ is the average response recorded during the ith time bin for stimuli m times the diameter of the CRF. Responses are averaged across repeated stimulus presentation trials.
Selectivity index for natural-vision movies. We define a
selectivity index based on the responses of a neuron across a stimulus set:
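$$S = \frac{1 - \left\{ \left( \sum_{i=1}^{n} r_i \right)^2 \Big/ \; n \sum_{i=1}^{n} r_i^2 \right\}}{1 - 1/n} = \frac{1 - \left\{ \mu^2 / \left( \mu^2 + \sigma^2 \right) \right\}}{1 - 1/n} \tag{2}$$
Here µ is the mean response of the cell, σ is its SD, and the number of time bins is given by n.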
The terms in braces define the activity fraction of the neuron across the stimulus set (Tovee et al., 1993). It is easy to anticipate the asymptotic behavior of the activity fraction (consider the expanded form of the activity fraction in the middle expression of Eq. 2). If a neuron were nonselective, then $r_i$ would be constant across stimuli and the numerator and denominator of the activity fraction would be equal. In contrast, if a neuron responded to only the kth stimulus then the numerator would be given by $(r_k)^2$, whereas the denominator would be larger by a factor of n, $n(r_k)^2$. Thus, the activity fraction ranges from 1, when the cell is nonselective, to 1/n, when the cell responds to a single stimulus frame.
Equation 2 rescales the activity fraction so that it conveniently
ranges from 0 to 1. S will be 0 if a neuron is completely nonselective and 1 if it responds only to a single stimulus. For convenience, we express S as a percentage. In a previous
publication, the selectivity index was referred to as the sparseness
index (Vinje and Gallant, 2000). In this paper, we use
sparseness as an adjective describing how stimuli are represented by
sensory neurons; therefore, increasing sparseness should produce
numerous effects, including increasing selectivity.
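The limiting behavior described above can be checked numerically with a short sketch. It assumes the activity fraction has the form (Σ r_i)² / (n Σ r_i²), which is the form implied by the single-stimulus argument in the text, and rescales it by 1 - 1/n. The function name `selectivity_index` is hypothetical.

```python
import numpy as np

def selectivity_index(rates):
    """Selectivity index S in [0, 1] from mean responses per time bin.

    Assumed form: S = (1 - A) / (1 - 1/n), with activity fraction
    A = (sum r_i)^2 / (n * sum r_i^2).
    Returns nan if every response is zero (A is then undefined).
    """
    r = np.asarray(rates, dtype=float)
    n = r.size
    activity_fraction = r.sum() ** 2 / (n * np.sum(r ** 2))
    return (1.0 - activity_fraction) / (1.0 - 1.0 / n)

s_flat = selectivity_index([5.0, 5.0, 5.0, 5.0])    # nonselective neuron
s_single = selectivity_index([0.0, 0.0, 7.0, 0.0])  # responds to one bin only
```

Under these assumptions a constant response gives S = 0 and a response confined to a single time bin gives S = 1, matching the range stated in the text.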
Information transmission in sensory neurons. From the
perspective of information theory, an axon is a biological
communication channel. Consider an observer who is monitoring the axon
of a sensory neuron with known filtering properties. Before the neuron responds, the observer is uncertain about the nature of the stimulus. After observing the responses of the neuron, the observer can determine
the overlap between the stimulus and the neural filtering properties.
Thus, the response of the neuron reduces the observer's uncertainty
about the stimulus. The amount by which a response reduces uncertainty
is referred to as the mutual information carried between stimulus and response.
The total stimulus entropy, H(s), quantifies the observer's
uncertainty regarding the stimulus before the response of the neuron is
observed. The conditional stimulus entropy, H(s|r), gives the stimulus uncertainty that remains after response observation. If
the response is reliably influenced by the stimulus, then the conditional stimulus entropy will necessarily be less than the total
stimulus entropy. The transmitted mutual information is given by
(Cover and Thomas, 1991):
$$I(s, r) = H(s) - H(s|r) \tag{3}$$
The stochastic nature of spike generation means that neural
responses are variable even when a stimulus is repeated exactly. The
response variability caused by noise (noise entropy) limits the amount
of information that can be transmitted about a stimulus. Figure
2A illustrates the
relationship between stimulus entropy, noise entropy, and mutual
information. The reduction in stimulus uncertainty is equal to
the reduction in uncertainty regarding the response:
$$I(s, r) = H(r) - H(r|s) \tag{4}$$

Figure 2.
Sensory neurons transmit information when their
responses allow an observer to reduce uncertainty regarding the nature
of the stimulus. A, Diagram illustrating relationship
between uncertainty and information. The first rectangle
symbolizes the total uncertainty present in the set of all
stimulus-response pairings for a given neuron; the second
rectangle represents the observer's a priori
uncertainty about the stimuli in a natural-vision movie; the
third rectangle represents the uncertainty in the observed
responses of the neuron. These uncertainties can be translated into
entropies by means of Equation 5. The single number that summarizes
overall stimulus uncertainty is the total stimulus entropy,
H(s), while the total response entropy is H(r).
The remaining rectangles are the conditional stimulus
uncertainty (C.S.U.) and the conditional response
uncertainty (C.R.U.) (quantified by the entropies
H(s|r) and H(r|s), respectively). The
gray-shaded region denotes correlations between the stimulus
and the responses of the neuron; this correlation is what allows
information, I(s, r), to be transmitted. If every stimulus
evokes a unique and repeatable response, then response uncertainty will
be entirely determined by stimulus uncertainty. In this case the
gray-shaded region would completely overlap both stimulus and response
uncertainties. In real neurons, repeated presentation of a stimulus
produces a range of responses, so H(r) > I(s, r). The
remaining uncertainty, H(r|s), is attributable to noise in
the encoding and transmission process. B, Grayscale
rastergram of single neuron responses to repeated movie presentations.
Rows represent repeated presentations of the movie, whereas
columns represent individual time bins. Each time bin
contains a single response word whose identity is determined by the
number of action potentials (identity is indicated by the shading of
each bin). The total response entropy, H(r), is a function
of the frequency with which each word is observed,
$p(r_j)$. C, Magnified
view of responses to one stimulus repeated 20 times. Variation in the
identity of the response words is clearly visible across trials and is
quantified as noise entropy, H(r|s = k). Noise
entropy is a function of the probability that each word occurs in
response to the kth stimulus,
$p(r_j|s_k)$.
Here, H(r) is the total response entropy, which
quantifies the overall variability of the responses of a neuron across
the stimulus ensemble. H(r|s) is the conditional response
entropy, describing the average variability in responses evoked by a
single stimulus. The conditional response entropy is equivalent to the noise entropy. In practice it is often easier to evaluate response entropies than stimulus entropies.
Calculation of total response entropy and conditional response
entropy. It is straightforward to compute the total response entropy via the direct method (de Ruyter van Steveninck et al., 1997; Borst and Theunissen, 1999; Reich et al., 2000). All direct information estimation methods begin
by translating the spike train into discrete words that represent local
spike patterns. The choice of translation process is equivalent to
choosing a hypothesis about how neurons encode and decode information.
The detailed nature of the encoding/decoding process is still
unresolved for V1 neurons, but the most common assumption is that
neurons in V1 employ a memory-less rate code. Under this
assumption, information is carried by the number of spikes occurring in
each time bin. All bins are treated independently, so there is no
possibility that information is carried (or lost) by patterns in the
firing rate that extend across multiple bins. This rate-coding
assumption also ignores highly precise temporal patterns that may occur
within a single time bin (i.e., tight coupling of spike times to
external events or internal oscillations).
If the firing rate possesses temporal correlations extending across
multiple time bins, then the assumption of a memory-less rate code may
lead to overestimation of the information transmission rate of the
neuron. Conversely, if information is carried by the temporal structure
of the spiking activity within a time bin, then the memory-less rate
code assumption may lead to underestimation of the information
transmission rate. Clearly, the assumption of a memory-less rate code
has strengths and weaknesses. Many more complex neural codes have been
proposed (for example, see Optican and Richmond, 1987; Richmond et al., 1987; Meister, 1996; de Ruyter van Steveninck et al., 1997), but their
existence is controversial. More complex coding schemes are also more
difficult to assess experimentally; this is especially true for codes
involving extended temporal correlations. For these reasons, we have
restricted the current analysis to the hypothesis of memory-less rate coding.
After the spike train is translated into discrete words, the
probability of word occurrence is determined empirically from the data.
After determining the occurrence probability of each word, the entropy
can be found using (Shannon and Weaver, 1949):
$$H = -\sum_j p_j \log_2 p_j \tag{5}$$
The summation runs over discrete words, and $p_j$ is the probability of the occurrence of the jth word.
Under the assumption of a memory-less rate code, the spike train is
divided into nonoverlapping time bins that are treated as independent
words. Each word is uniquely identified via the number of spikes that
it contains (Reich et al., 2000). The total response entropy is given by:
$$H(r) = -\sum_j p(r_j) \log_2 p(r_j) \tag{6}$$
where $p(r_j)$ is the number of time
bins containing exactly j action potentials divided by the
total number of time bins. The total response entropy is a function of
both the number of distinct response words and their frequency of
occurrence. Total response entropy is therefore related to the dynamic
range of a neuron; neurons with larger dynamic ranges will be able to
generate a larger variety of spike patterns in response to a given
stimulus set.
The noise entropy describes the average variability of responses to
single stimuli. Let $p(r_j|s_k)$ be the probability that the jth word occurred in response to the kth stimulus. The noise entropy for stimulus k is given by:
$$H(r|s = k) = -\sum_j p(r_j|s_k) \log_2 p(r_j|s_k) \tag{7}$$
The probability of word occurrence for each stimulus, $p(r_j|s_k)$, is equal to the number of stimulus-repetition trials on which the kth stimulus
produces j action potentials, divided by the overall number
of repetitions. (In the experiments reported in this paper, each
stimulus was repeated between 10 and 40 times.)
The overall noise entropy of the neuron is found by averaging across
the noise entropies of the individual stimuli:
$$H(r|s) = \frac{1}{n} \sum_{k=1}^{n} H(r|s = k) \tag{8}$$
Given H(r) and H(r|s), Equation 4
provides information transmission per time bin. Figure 2B, C
provides a graphical overview of how the response probabilities are
determined from the data. From Equation 4 it can be seen that the
fundamental quantity in the analysis is the information per time bin.
However, to allow comparison with other studies, it is desirable to
report information transmission rates per second or per spike.
Information per second is found by dividing the information per time
bin by the duration of each time bin. Information per spike is found by
dividing the information per second by the mean number of spikes per second.
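The chain from spike counts to information per time bin, per second, and per spike can be sketched as below. This is an illustration under the memory-less rate code assumption, with each word being the spike count in one time bin; it omits the finite-data bias correction, so its outputs correspond to uncorrected estimates. The array layout (trials by time bins) and all function names are assumptions.

```python
import numpy as np

def entropy_bits(words):
    """Shannon entropy (bits) of the empirical distribution of integer words."""
    _, freq = np.unique(np.asarray(words), return_counts=True)
    p = freq / freq.sum()
    return float(-np.sum(p * np.log2(p)))

def information_per_bin(spike_counts):
    """spike_counts: array (n_trials, n_bins) of spikes per time bin.

    Returns (I, H_total, H_noise) in bits per time bin, uncorrected for bias.
    """
    counts = np.asarray(spike_counts)
    # Total response entropy: pool every bin on every trial.
    h_total = entropy_bits(counts.ravel())
    # Noise entropy: entropy across trials within each bin, averaged over bins.
    h_noise = float(np.mean([entropy_bits(counts[:, k])
                             for k in range(counts.shape[1])]))
    return h_total - h_noise, h_total, h_noise

def information_rates(spike_counts, bin_sec):
    """Convert bits per bin into bits per second and bits per spike."""
    info_bin, _, _ = information_per_bin(spike_counts)
    bits_per_sec = info_bin / bin_sec
    spikes_per_sec = np.mean(spike_counts) / bin_sec
    return bits_per_sec, bits_per_sec / spikes_per_sec

# Toy data: 4 trials, 2 bins; bin 0 is always silent, bin 1 always has 1 spike.
toy = np.array([[0, 1], [0, 1], [0, 1], [0, 1]])
toy_info, toy_h_total, toy_h_noise = information_per_bin(toy)
toy_bps, toy_bits_per_spike = information_rates(toy, bin_sec=0.0138)
```

In the toy data the response is perfectly repeatable, so the noise entropy is zero and all of the total response entropy is transmitted as information.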
In the current study, two experimental factors may result in
underestimation of information transmission rates. First, visual stimuli were presented while animals performed a simple visual fixation
task. During fixation the eye is not entirely steady; a small degree of
ocular drift and corrective microsaccadic eye movements are inevitable.
These small eye movements introduce variability in retinal stimulation
that in turn increases response variability. This artificially inflates
our estimates of H(r|s) and thereby decreases our
estimates of I(s, r). A second source of bias might arise
from the absence of top-down influences that could influence V1
responses during natural vision. For example, during natural vision the
ocular-motor system might provide V1 with an efference signal denoting
eye movements that could allow V1 to process information more
efficiently. This possibility is clearly speculative, but the possible
role of extraretinal influences is poorly understood in V1. Both
factors suggest that our experimental estimates of information
transmission should be interpreted as de facto lower bounds
on the true capabilities of V1 neurons. Fortunately, our analysis
centers on the changes in information transmission that result from
differential stimulation of the nCRF. These factors should be common
across nCRF stimulation conditions and have little or no effect on our results.
Choice of time-bin duration. Because our analysis assumes a
rate code, the duration of the time bins should match the true integration time of the target neurons. Unfortunately, this critical time constant is unknown. To compensate, we analyze the data using several different binning times (4.6, 13.8, 25, and 50 msec) that span
the range of plausible integration times (Bair, 1999). A summary of the results obtained with other bin lengths is given in Table 1 and also discussed in Results. To facilitate comparison with previous work (Vinje and Gallant, 2000), we focus on the results obtained with 13.8 msec time
bins. With the exception of Table 1, all figures and results come from
13.8 msec binning unless stated otherwise.
Correction for finite data bias in the response entropies.
The values for $p(r_j)$ and $p(r_j|s_k)$ are estimated from the experimental
data, leading to uncertainty in H(r) and
H(r|s). The uncertainties in the entropy estimates contain
both random error (because of sampling) and systematic biases. Error
attributable to sampling is handled conventionally, by considering
whether results are statistically significant. The bias, however, can
be removed explicitly. In particular, the noise entropy is strongly
affected by limitations in the number of trial repetitions. In general,
this results in potential underestimation of the noise entropy (which
would produce an overestimation of information transmission).
For both H(r) and H(r|s), the relationship
between the true entropy and the experimental estimate of the entropy
is given by (Treves and Panzeri, 1995; Strong et al., 1998):
$$H_{\mathrm{exp}} = H_{\mathrm{true}} + \sum_{\nu = 1}^{\infty} \frac{c_\nu}{N^\nu} \tag{9}$$
where $c_\nu$ is an empirically determined weighting coefficient for the νth correction term and N denotes the number of times each stimulus was repeated. In our data, the linear bias term dominates the sum in Equation 9. In light of this, we consider only the first- and second-order correction terms:
$$H_{\mathrm{exp}} = H_{\mathrm{true}} + \frac{c_1}{N} + \frac{c_2}{N^2} \tag{10}$$
All of the analyses presented in this report used the
bias-corrected entropies, Htrue.
To find Htrue, we first divide the original data set into several subsets, each containing N trials, and evaluate Hexp for each subset. Subsets contain, respectively, one-quarter, one-third, one-half, or all of the original trials. Second, we fit Equation 10 to these data via
least-squares minimization. The value of
Htrue is the ordinate intercept of the
best-fit function. Figure 3 illustrates this process for an example neuron.
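The extrapolation can be sketched as a quadratic least-squares fit in 1/N whose intercept is taken as the bias-corrected entropy. The sketch assumes a model of the form Hexp = Htrue + c1/N + c2/N², and uses `numpy.polyfit` as a stand-in for whatever fitting routine was actually used; the function name and the synthetic numbers are hypothetical.

```python
import numpy as np

def extrapolate_entropy(n_trials, h_exp):
    """Fit H_exp = H_true + c1/N + c2/N^2 and return (H_true, c1, c2).

    n_trials: number of trials in each data subset.
    h_exp:    entropy estimated from each subset (bits).
    """
    inv_n = 1.0 / np.asarray(n_trials, dtype=float)
    # polyfit returns coefficients from highest power down: [c2, c1, H_true].
    c2, c1, h_true = np.polyfit(inv_n, np.asarray(h_exp, dtype=float), deg=2)
    return float(h_true), float(c1), float(c2)

# Synthetic check: subsets of roughly 1/4, 1/3, 1/2, and all of 40 trials.
subset_sizes = np.array([10.0, 13.0, 20.0, 40.0])
synthetic_h = 2.0 - 0.5 / subset_sizes - 1.0 / subset_sizes ** 2
h_true_est, c1_est, c2_est = extrapolate_entropy(subset_sizes, synthetic_h)
```

With four subset sizes and three free parameters the fit has only one residual degree of freedom, so the intercept is sensitive to noise in the individual entropy estimates.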

Figure 3.
Finite repetitions of each stimulus lead to
systematic biases in the experimentally measured entropies. Response
entropy (circles) and noise entropy (squares) are
plotted versus 1/N, where N is the number of
trials used in the calculation of
$p(r_j)$ and $p(r_j|s_k)$. Error bars were determined by the
Jackknife method (Efron and Tibshirani, 1993). Because
of the small size of the error bars, they are shown in white
atop the data points. Lines represent the least-squares fit
of Equation 10 to the response entropy data (solid line) and
noise entropy data (dashed line). The true entropy values
are the ordinate intercepts of the best-fit functions.
Testing for excessive finite data bias. The number of trials
required for accurate bias correction depends on time-bin duration and
the response properties of the neuron under study. If there are too few
repeated stimulus presentations, higher-order correction terms become
important and Equation 10 fails to sufficiently describe the finite
data bias. Fortunately excessive levels of bias contamination can be
detected by testing whether the experimentally estimated entropies
violate the Ma bound (Strong et al., 1998).
The Ma bound is a lower bound on response entropy and can be estimated
for both the total response entropy, H(r), and the noise
entropy, H(r|s). For words composed of single time bins, the general expression for the Ma bound is given by (Ma, 1981):
$$H_{\mathrm{Ma}} = -\log_2 \left( \sum_j p_j^2 \right) \tag{11}$$
HMa is useful because it is less
susceptible to finite data bias than Hexp.
The response entropy can sink below the Ma bound only if
Hexp is strongly contaminated by finite data bias (Strong et al., 1998). Because finite data
bias affects experimental estimates of the noise entropy more strongly
than estimates of the total response entropy, the noise entropy is more
likely to violate the Ma bound.
We computed the Ma bounds on both total response entropy and noise
entropy to allow exclusion of any neurons with gross levels of bias
contamination. During responses to natural-vision movies, both response
entropies were greater than HMa for all neurons. This satisfies the Ma bound criterion and indicates that our entropy estimates are free from excessive finite data bias.
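A check of this kind can be sketched as follows, assuming the bound takes the coincidence-probability form -log2(Σ p_j²), where Σ p_j² is the chance that two independent draws yield the same word. That form is an assumption of this sketch, and the function names are hypothetical. For the exact probabilities the bound never exceeds the Shannon entropy, so a violation can arise only when the two quantities are estimated from limited data and are affected differently by finite-data bias.

```python
import numpy as np

def word_probabilities(words):
    """Empirical probability of each distinct integer word."""
    _, freq = np.unique(np.asarray(words), return_counts=True)
    return freq / freq.sum()

def shannon_entropy_bits(p):
    """Shannon entropy (bits) of a probability vector with no zero entries."""
    return float(-np.sum(p * np.log2(p)))

def coincidence_bound_bits(p):
    """Assumed form of the lower bound: -log2 of the coincidence probability."""
    return float(-np.log2(np.sum(p ** 2)))

p_skewed = word_probabilities([0, 0, 0, 1, 1, 2])
p_uniform = word_probabilities([0, 1, 2, 3])
h_skewed = shannon_entropy_bits(p_skewed)
bound_skewed = coincidence_bound_bits(p_skewed)
h_uniform = shannon_entropy_bits(p_uniform)
bound_uniform = coincidence_bound_bits(p_uniform)
```

For a uniform word distribution the two quantities coincide; for any other distribution the bound lies strictly below the entropy.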
Significance testing. We determined whether the descriptive
statistics of two sample sets are significantly different via randomized, two-tailed t tests (Manly, 1991).
In all cases randomization was performed to rule out the null
hypothesis that the two sets of observations come from the same
underlying population distribution. Thus, significance implies that the
value of the descriptive statistic for nCRF data is significantly
different from the corresponding value obtained with CRF data. The
standard significance criterion of p ≤ 0.05 is sufficient when comparing two collections of neurons. However, when judging significant differences in single neurons or time bins, we use a more restrictive significance criterion, p ≤ 0.01.
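The logic of a randomization test can be sketched as below. For simplicity the sketch uses the absolute difference of means as the test statistic rather than a t statistic, so it illustrates the general approach and is not the exact procedure used here. The function name, the number of permutations, and the seed are arbitrary choices.

```python
import numpy as np

def randomization_test(a, b, n_perm=10000, seed=0):
    """Two-tailed randomization test on the difference of means.

    Returns the fraction of random relabelings whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        diff = abs(perm[: a.size].mean() - perm[a.size:].mean())
        # Small tolerance guards against floating-point rounding of ties.
        if diff >= observed - 1e-12:
            count += 1
    # Add-one correction so the estimated p value is never exactly zero.
    return (count + 1) / (n_perm + 1)

p_separated = randomization_test([1, 2, 3, 4, 5],
                                 [101, 102, 103, 104, 105], n_perm=2000)
p_identical = randomization_test([1, 2, 3], [1, 2, 3], n_perm=2000)
```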
RESULTS
Response modulation by the nCRF in area V1 during
natural vision
Many studies have demonstrated that the nCRF has pronounced,
generally suppressive effects on responses (Hubel and Wiesel, 1965; Blakemore and Tobin, 1972; Bishop et al., 1973; Nelson, 1991). However, nCRF modulation can also enhance responses (Jones, 1970; Hirsch and Gilbert, 1991; Knierim and Van Essen, 1992; Levitt and Lund, 1997; Kapadia et al., 2000). We have found that nCRF stimulation during natural
vision can both enhance and suppress responses. Figure
4A shows the peristimulus time
histogram (PSTH) obtained from one V1 neuron in response to stimulation
by a natural-vision movie confined to the CRF. When the stimulus size
is increased to four times the diameter of the CRF (4 × CRF) some
responses are enhanced while others are suppressed (Fig.
4B). The modulation ratio, Ri,
summarizes the influence of the nCRF on the ith stimulus time bin (see Materials and Methods). In Figure 4, those time bins with
significant Ri values (p ≤ 0.01) are shown in white (significant suppression) or
black (significant enhancement). Modulation by the nCRF
depends on both image content and the elapsed time from fixation onset.
These intrafixation temporal dynamics may reflect presynaptic
depression (Abbott et al., 1997; Chance et al., 1998), some form of short-term adaptation, or perhaps the influences of intracortical feedback (Rao and Ballard, 1999).

Figure 4.
The nCRF modulates responses
during natural vision. A, PSTH obtained from one V1 neuron
in response to a natural-vision movie confined to the CRF. Responses
are weakly modulated by the simulated fixations (information per
second, 13.1 bits/sec; information per spike, 0.18 bits/spike;
efficiency, 10%; selectivity index, 13%). B, Responses of
the same cell to a natural-vision movie composed of the CRF stimulation
used in A plus a circular surrounding region. The overall
stimulus size was 4 × CRF diameter. Stimulation of the nCRF
dramatically increases variation of responses across fixations
(information per second, 28.4 bits/sec; information per spike, 0.67 bits/spike; efficiency, 26%; selectivity index, 51%). Responses to
some stimuli are significantly enhanced (black bins;
p ≤ 0.01). For this neuron, enhancement is
concentrated in the onset transients occurring at the beginning of
simulated fixations. Other responses are strongly suppressed
(white bins; p ≤ 0.01). The
under-bar highlights those time bins where significant
enhancement and suppression occur.
To quantify the observed modulation, we calculated
Ri values for all time bins in our data set
(Fig. 5A-C). Again,
Ri values are colored according to their
significance: white for significant suppression,
black for significant enhancement. (Modulation ratios and
histogram values are plotted on logarithmic scales because of the large
dynamic range of modulation produced by natural stimulation of the
nCRF.) As stimulus size increases, there is a modest increase in the
number of significantly modulated time bins. At every stimulus size, enhancement is
less pronounced than suppression. However, for all stimulus
sizes a substantial fraction of modulation is positive. As stimulus
size increases, the net modulation becomes steadily more
suppressive.
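The bookkeeping behind these log-scaled modulation ratios can be sketched as follows. This is only an illustration: the definition of Ri is given in Materials and Methods, and the sketch assumes it is the response with nCRF stimulation divided by the response to the CRF-sized stimulus in the same time bin, which is what the Figure 5 caption implies. The function name `classify_bins` and the example spike counts are hypothetical.

```python
from math import log10

def classify_bins(crf_counts, ncrf_counts):
    """Sort time bins the way the Figure 5 caption describes.

    Assumption (not confirmed here): R_i = nCRF response / CRF response.
    Returns the log10 ratios that can be plotted, the number of valid bins,
    the number of valid bins with R_i = 0, and the number of excluded bins.
    """
    log_ratios = []
    n_zero = 0       # R_i = 0: valid bin, but log10 is undefined
    n_excluded = 0   # CRF response is 0: R_i itself is undefined
    for crf, ncrf in zip(crf_counts, ncrf_counts):
        if crf == 0:
            n_excluded += 1
        elif ncrf == 0:
            n_zero += 1
        else:
            log_ratios.append(log10(ncrf / crf))
    n_valid = len(log_ratios) + n_zero
    return log_ratios, n_valid, n_zero, n_excluded

# Hypothetical spike counts for four time bins
log_ratios, n_valid, n_zero, n_excluded = classify_bins([10, 4, 0, 5], [5, 8, 3, 0])
# log_ratios[0] < 0 (relative suppression), log_ratios[1] > 0 (relative enhancement)
```

Deciding which bins are significantly modulated would additionally require the statistical test described in Materials and Methods, which this sketch does not attempt.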

Figure 5.
Stimulus size affects the modulation ratio and the
mean spike rate. A-C, Modulation ratios,
Ri, for stimuli of sizes 2 × CRF,
3 × CRF, and 4 × CRF. Both axes are presented on a
logarithmic scale (base 10) to facilitate display. Positive values of
log[Ri] indicate relative enhancement, whereas
negative values of log[Ri] indicate relative
suppression. Those time bins with significantly enhanced ratios are
shown in black, whereas those with significantly decreased
ratios are shown in white (p ≤ 0.01). The total
number of valid time bins in each data set,
nt; the number of time bins demonstrating
significant enhancement, n ; and the number of
time bins demonstrating significant suppression,
n , are indicated to the right of
each histogram. Not all time bins are included in the histogram. Time
bins for which Ri = 0 are indicated by
no. These bins are included in the total
number of valid bins, but have undefined logarithms. Time bins where
CRF-sized stimuli evoke no spikes are indicated by
ne. These bins have undefined modulation ratios
and are not included in the total number of valid time bins.
Enhancement is less frequent than suppression, especially at larger
stimulus sizes. Significant enhancement occurs in 1.7, 2.3, and 3.0%
of the time bins for stimulus sizes of 2 × CRF, 3 × CRF and
4 × CRF, respectively, whereas significant suppression occurs in
4.6, 6.4, and 8.0%. D, Average spikes per second versus
stimulus size. The mean spike rate falls monotonically as stimulus size
increases. The mean spike rate in response to 4 × CRF stimuli is
~75% of the mean rate observed with CRF stimulation alone.
Suppression also significantly decreases the mean spiking rate of
individual neurons. The fractions of neurons whose spike rates are
significantly suppressed by nCRF stimulation are 50% at 2 × CRF,
59% at 3 × CRF, and 73% at 4 × CRF (p ≤ 0.01). The suppression of individual neurons is reflected in the
average spike rate of the population, which decreases with increasing stimulus size (Fig. 5D).
In a previous study, we showed that increasing stimulus size
decorrelated the responses of neuron pairs (Vinje and Gallant, 2000). The decorrelation index measures the relative overlap of the tuning properties for each neuron pair; as neuron pairs become decorrelated the overlap in their tuning functions is reduced. Thus,
for large stimuli, different neurons were unlikely to fire in response
to the same space-time stimulus, whereas for stimuli confined to the
CRF, there was a significant chance of correlated firing.
Increasing nCRF stimulation produces a net increase in suppressive
modulation, a reduction in the overall population activity rate, and a
reduction in tuning overlap. These results support the first untested
prediction: increasing nCRF stimulation reduces metabolic load by
lowering mean spike rates. Furthermore, these three findings suggest
that nCRF stimulation reduces the effective bandwidth of single
neurons, thereby restricting the range of stimuli that they represent.
nCRF stimulation increases information transmission rate
Does this shrinkage in effective bandwidth reduce the amount of
information represented by V1 neurons? If information is lost, then the
stimulus representation will be coarsened rather than made sparser
(Foldiak and Young, 1995; Olshausen and Field, 1997; Barlow, 2001). Information must be
preserved if nCRF stimulation truly increases sparseness. Information
transmission can be preserved in numerous ways. One possibility is that
the overall information transmission rate might be preserved at the
level of individual neurons. Alternatively, some neurons may increase
their information transmission rates while other neurons transmit less information.
Information transmission rates (bits per second) for our sample of V1
neurons are shown in Figure
6A-D. For each neuron at each
stimulus size, we compared information rates observed with and without
nCRF stimulation. Neurons with significantly increased information
rates are shown in black, while those with
significantly decreased rates are shown in white (p ≤ 0.01). The effects of natural nCRF stimulation vary across
neurons. Some exhibit decreases in information transmission rates,
whereas others exhibit increases. Interestingly, significant increases
in information transmission rates occur more frequently than
significant decreases. The ratio of significant increases to
significant decreases is 3.8:1 at 2 × CRF, 3.4:1 at 3 × CRF, and 3.7:1 at 4 × CRF.

Figure 6.
Information transmission in V1 neurons increases
with stimulus size. A-D, Information transmission rates for
stimuli of sizes 1 × CRF, 2 × CRF, 3 × CRF, and
4 × CRF. Neurons in which stimulus size significantly increases
information transmission per second are shown in black,
whereas those neurons with significant decreases in transmission are
shown in white (p ≤ 0.01). The total number of
neurons in each histogram, nt; the number
of neurons demonstrating significant increases,
n ; and the number of neurons demonstrating
significant decreases, n , are indicated to
the right of each histogram. Significant increases occur in
43, 39, and 45% and significant decreases occur in 11, 14, and 12% of
the neurons for stimulus sizes of 2 × CRF, 3 × CRF, and
4 × CRF, respectively. E, Mean information
transmission per second versus stimulus size. The mean rates obtained
with 2 × CRF and 3 × CRF stimuli are significantly higher
than the mean rate observed with stimuli confined to the CRF
(p ≤ 0.05). With stimuli of 4 × CRF diameter the
increase in mean rate is marginally significant (p ≤ 0.07).
For our sample of neurons, the average information transmission rate
also increases with stimulus size (Fig. 6E). The increase in
mean rate is modest but statistically significant for stimulus sizes of
2 × CRF and 3 × CRF (p ≤ 0.05) and is
marginally significant for stimuli of 4 × CRF diameter
(p ≤ 0.07).
Table 1 shows the average information rate as a function of stimulus
size and time-bin duration. In general, the average rate increases as
time-bin duration decreases. From 50 msec to 4.6 msec, the information
transmission rate increases by ~250%. The increase in information
rates for short binning times is commonly observed in
neurophysiological data sets (Strong et al., 1998) and
occurs because H(r) increases more rapidly than
H(r|s) as bin duration shrinks.
Our second prediction is that the average information transmission rate
should not decrease as stimulus size increases. Our results demonstrate
that information transmission actually increases with stimulus size.
This is consistent with the predicted preservation of information. It
also suggests that nCRF stimulation may be necessary to fully realize
the information-processing potential of V1 neurons.
nCRF stimulation increases information per spike
As discussed in the introductory remarks, sparse coding offers
several potential advantages to the nervous system. It may simplify
development of neural connections, increase learning rates, and
increase memory capacity (Barlow, 1961, 2001). Sparse coding also reduces the
number of action potentials required to represent a scene and thereby
decreases the metabolic demands of information processing
(Srinivasan et al., 1982; Laughlin et al., 1998). If the system is to maintain the fidelity with which a
scene is represented, this reduction in spiking activity must be
accompanied by an increase in the average amount of information each
spike provides about the stimulus. Thus, natural nCRF stimulation should increase the average information carried by each spike.
The average information that a spike transmits about the stimulus is
found by simply dividing the information per second by the mean number
of spikes per second: Ispike = Isec/µ, where µ is the mean spike rate
of the neuron for all stimuli of a given size.
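As a rough worked example of this relation, the values reported in the Figure 4 caption can be inverted to recover the mean spike rates they imply. This is a sketch only: the function names are hypothetical, and because the caption values are rounded, the implied rates are approximate.

```python
def info_per_spike(info_rate_bits_per_sec, mean_rate_spikes_per_sec):
    """Ispike = Isec / mu, in bits per spike."""
    if mean_rate_spikes_per_sec <= 0:
        raise ValueError("mean spike rate must be positive")
    return info_rate_bits_per_sec / mean_rate_spikes_per_sec

def implied_mean_rate(info_rate_bits_per_sec, bits_per_spike):
    """Invert the same relation: mu = Isec / Ispike, in spikes per second."""
    if bits_per_spike <= 0:
        raise ValueError("information per spike must be positive")
    return info_rate_bits_per_sec / bits_per_spike

# Rounded values from the Figure 4 caption (one example neuron)
crf_rate = implied_mean_rate(13.1, 0.18)    # CRF alone: about 72.8 spikes/sec
ncrf_rate = implied_mean_rate(28.4, 0.67)   # 4 x CRF: about 42.4 spikes/sec
```

For this example neuron the information rate roughly doubles while the implied spike rate falls, which is why the information per spike rises so sharply.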
Information transmission per spike is shown in Figure
7A-D. Figure conventions are
identical to those used in Figure 6. Stimulation of the nCRF can
increase or decrease the information per spike, but the trend is
strongly toward increasing the information content of spikes. The ratio
of neurons with significant increases to those with significant
decreases is 6.5:1 at 2 × CRF and 26:1 at 3 × CRF. For data
obtained with stimuli of 4 × CRF diameter, all significantly
modulated neurons show increases in their information transmission per
spike.

Figure 7.
Information per spike increases with stimulus
size. A-D, Information transmission per spike versus
stimulus size. Figure conventions match those used in Figure 6.
Significant increases in information per spike occur in 59, 70, and
79% and significant decreases occur in 9, 3, and 0% of the neurons
for stimulus sizes of 2 × CRF, 3 × CRF, and 4 × CRF,
respectively. E, Mean (black circles) and median
(gray triangles) information per spike estimates as a
function of stimulus size. The mean information per spike obtained with
2 × CRF and larger stimuli is significantly higher than that
observed with stimuli confined to the CRF (p ≤ 0.05).
The mean information per spike also increases substantially as a
function of stimulus size (Fig. 7E, black circles).
For stimuli of 4 × CRF diameter, the mean information per spike
is 1.85 times the value obtained with CRF-sized
stimuli. All stimuli of a size ≥2 × CRF produce significant
increases in information per spike (p ≤ 0.05). Because
the information-per-spike distributions are positively skewed, we also
evaluated the median information transmission per spike (Fig.
7E, gray triangles). As expected, the medians increase
less than the means, but still increase significantly for sizes
≥3 × CRF (p ≤ 0.05). Table 1 presents the
average information per spike as a function of stimulus size and
time-bin duration. As duration decreases, the information per spike
increases in a manner similar to that observed for information per second.
Natural nCRF stimulation increases the information content of each
spike for most neurons in our sample. This confirms the third
prediction of the hypothesis that nCRF stimulation increases sparseness
in V1.
nCRF stimulation during natural vision increases efficiency
As nCRF stimulation increases sparseness, it should also increase
the efficiency of information processing. In information theoretic
terms, efficiency measures the fraction of available bandwidth that a
neuron actually uses to transmit information. Formally this is
expressed as the ratio of the amount of information actually
transmitted over the theoretical maximum amount of information that
could be transmitted (Cover and Thomas, 1991; Borst and Theunissen, 1999):
Efficiency = Isec / H(r)    (12)
Figure 8A-D shows
efficiency versus stimulus size for our sample of neurons. Figure
conventions again match those used in Figure 6. As stimulus
size increases, so does the efficiency of single neurons. The ratio of
neurons with significant increases to those with significant decreases
is 6.3:1 at 2 × CRF and 26:1 at 3 × CRF. With 4 × CRF
stimuli, all significantly modulated neurons show increases in the
efficiency of information transmission.

Figure 8.
Efficiency of information transmission increases
with stimulus size. A-D, Efficiency versus stimulus size.
Figure conventions match those used in Figure 6. Significant increases
in efficiency occur in 57, 70, and 82% and significant decreases occur
in 9, 3, and 0% of the neurons for stimulus sizes of 2 × CRF,
3 × CRF, and 4 × CRF, respectively. E, Mean
efficiency as a function of stimulus size. Efficiency observed with
2 × CRF and larger stimuli is significantly higher than that
observed with stimuli confined to the CRF (p ≤ 0.05).
Mean efficiency increases with nCRF stimulation (Fig. 8E);
for 4 × CRF-sized stimuli, the mean efficiency is 1.6 times
larger than the value obtained with CRF-sized stimuli. The increases in
mean efficiency are statistically significant for all stimuli of a size
≥2 × CRF (p ≤ 0.05). Table 1 presents the
average efficiency as a function of stimulus size and time-bin
duration. In contrast to information rate and information per spike,
mean efficiency does not change substantially as bin duration
decreases. As bin duration shrinks, increases in H(r)
inflate the apparent information transmission per second and per spike.
However, in the case of efficiency, the denominator of Equation 12
largely cancels this effect.
Neurons use their available transmission bandwidth more efficiently
when the nCRF is stimulated than when stimuli are confined to the CRF.
Because efficiency does not explicitly depend on spike rate, this
result complements the finding that nCRF stimulation increases the
amount of information available in each spike and confirms the last of
our predictions.
Information transmission and efficiency correlate
with selectivity
Thus far we have shown that nCRF stimulation increases information
transmission rates, the information content of single spikes, and
processing efficiency in both individual neurons and our sample population. In a previous study (Vinje and Gallant, 2000), we showed that nCRF stimulation increases the
selectivity of V1 neurons (Fig. 9). All
of these results are consistent with the idea that nCRF stimulation
increases the sparseness of the representation of visual information in
V1. A supplementary test of the sparse coding hypothesis is to
determine whether selectivity is correlated with information
transmission in individual neurons. If the nCRF increases sparseness,
then cells that show a substantial increase in selectivity contingent
on nCRF stimulation should be more informative and more efficient than
those that do not show such changes.

Figure 9.
Stimulus selectivity increases with stimulus size.
A-D, Selectivity versus stimulus size. Figure conventions
match those used in Figure 6. Significant increases in selectivity
occur in 61, 70, and 88% and significant decreases occur in 11, 5, and
2% of the neurons for stimulus sizes of 2 × CRF, 3 × CRF,
and 4 × CRF, respectively. E, Mean selectivity versus
stimulus size. Selectivity obtained with 2 × CRF and larger
stimuli is significantly higher than that found with stimuli confined
to the CRF (p ≤ 0.05).
Stimulus selectivity is not significantly correlated with information
per second in our sample of cells (Fig.
10A-D). However, selectivity is significantly correlated with information per spike for
all stimulus sizes (Fig. 10E-H; p ≤ 0.01). The
correlations between information per spike and selectivity are 0.91, 0.90, 0.89, and 0.89 for CRF-, 2 × CRF-, 3 × CRF-, and
4 × CRF-sized stimuli, respectively. Finally, stimulus
selectivity is also significantly correlated with efficiency (Fig.
10I-L; p ≤ 0.01). Correlations between efficiency and
selectivity are 0.89, 0.89, 0.87, and 0.88 for CRF-, 2 × CRF-,
3 × CRF-, and 4 × CRF-sized stimuli, respectively.

Figure 10.
Information per spike and efficiency are
correlated with selectivity. A-D, Information transmission
rate versus selectivity for stimuli of sizes 1 × CRF, 2 × CRF, 3 × CRF, and
4 × CRF. The lack of meaningful correlation is immediately
apparent. E-H, Information per spike versus selectivity.
These measures are strongly correlated for all stimulus sizes.
I-L, Efficiency versus selectivity. The measures are also
correlated for all stimulus sizes.
The lack of correlation between information transmission per second and
selectivity suggests that the observed increases in information rate
may not be central to the process of increasing sparseness. This is
perhaps unsurprising, given that the prediction was merely that average
information transmission should be preserved. In contrast, the
correlation of selectivity with information per spike and efficiency
suggests that these three measures are related by an underlying causal
factor. It seems likely that this causal factor is the sparseness of
information representation in V1. As sparseness increases, there are
corresponding increases in selectivity, information per spike, and
efficiency of information transmission.
Results obtained with flashed natural-image patches
Natural-vision movies are designed to mimic the stimulation that
occurs during saccadic vision of a static scene. The majority of the
movie consists of fixations where image content is held constant. These
fixations are linked by simulated saccades with realistic acceleration
profiles. The stimulation contained in saccades blends together image
patches and avoids any discontinuous change in stimulus content. Most
previous physiology experiments use flashed stimuli that contain
instantaneous onset and offset transitions and substantial
interstimulus intervals. Clearly the nature of the transitions between
image patches is very different in these two procedures.
To facilitate comparisons between our results and those obtained using
flashed stimuli, we performed the following control experiment. Image
patches were selected from natural scenes and presented as flashed
stimuli (n = 10 neurons; see Materials and Methods).
Responses to the flashed stimuli were concatenated to form a
pseudo-movie and analyzed in the same manner as natural-vision movies
(responses during interstimulus intervals were discarded).
Unfortunately, flashed stimulus patches were presented only five times;
therefore, this data set suffers from larger entropy biases than our
main data set. This problem is partially caused by difficulty in
accurately estimating the second correction term in Equation 10 and was
partially ameliorated by using only the linear correction term. To
enable comparison with data from natural-vision movies, we also limited
the natural-vision data to the first five trials and applied only the
linear bias correction. This approach subjected both data sets to the
same bias-producing conditions and thus allowed a fair comparison
between the results from the natural vision and the flashed stimuli.
The effects of nCRF stimulation with flashed stimuli are generally
similar to those obtained with natural-vision movies. Both the
information per spike and the efficiency increase with stimulus size.
For the largest flashed stimulus size (3 × CRF), the average information per spike increases by 25% and the average efficiency increases by 10%. Information transmission per second does not increase.
These results suggest that increasing stimulus size increases response
sparseness with flashed stimuli, as it does with natural-vision movies.
However, these effects are somewhat smaller with flashed stimuli. This
may be an artifact of small sample size, because not all neurons
demonstrate strong nCRF modulation effects. Alternatively, this may
reflect differences in the transient responses evoked by the two
stimulus classes. When the first 200 msec are removed from the response
to each flashed-image patch, information transmission more closely
matches that obtained with natural-vision movies.
Total entropy and noise entropy both decrease with increasing
stimulus size
Information is the difference between two measures of variance,
the total response entropy and the noise entropy. An increase in
information can reflect a decrease in noise entropy, an increase in
total entropy, or some combination of the two. Each of these changes
would alter neuronal spiking patterns in ways that allow insight into
the specific biophysical mechanisms underlying nCRF modulation. If nCRF
stimulation increases the total response entropy, then the nCRF must
increase the dynamic range of the neuron and/or the reliability of
spikes elicited by the stimulus. In contrast, if the noise entropy
decreases consequent to nCRF stimulation, then the nCRF must suppress
spikes that are not relevant to encoding the stimulus.
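The decomposition of information into total response entropy and noise entropy can be illustrated with a minimal plug-in estimate on toy data. This sketch makes several simplifying assumptions that are not taken from the paper: each response "word" is the spike count in a single time bin, each time bin of the repeated movie is treated as one stimulus condition, and no bias correction (such as the terms of Equation 10) is applied. The function names and spike counts are hypothetical.

```python
from collections import Counter
from math import log2

def entropy_bits(symbols):
    """Plug-in entropy (bits) of the empirical distribution of symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def information_terms(responses):
    """responses[trial][time_bin] holds a spike count.

    Total entropy H(r): entropy of counts pooled over all trials and bins.
    Noise entropy H(r|s): entropy across trials within a bin, averaged over bins.
    Information: H(r) - H(r|s), in bits per time bin.
    """
    n_bins = len(responses[0])
    pooled = [count for trial in responses for count in trial]
    total = entropy_bits(pooled)
    noise = sum(
        entropy_bits([trial[t] for trial in responses]) for t in range(n_bins)
    ) / n_bins
    return total, noise, total - noise

# Toy neuron whose response is identical on every trial: no noise entropy
reliable = [[0, 2, 0, 1]] * 4
# Toy neuron whose counts vary across trials but not with the stimulus
unreliable = [[0, 1], [1, 0]]

reliable_terms = information_terms(reliable)      # (1.5, 0.0, 1.5)
unreliable_terms = information_terms(unreliable)  # (1.0, 1.0, 0.0)
```

Dividing the per-bin values by the bin duration would convert them to bits per second. With real data and few trials, plug-in estimates like these are biased, which is why the corrections described in Materials and Methods matter.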
Entropy measures are summarized in Figure
11. Figure 11A-D shows
total response entropy, and Figure 11E-H shows noise
entropy. Those neurons with significantly increased entropies are shown in black, while those with significantly decreased entropies
are shown in white (p ≤ 0.01). It is readily
apparent that the nCRF has a large effect on both total entropy and
noise entropy. On average, both total entropy and noise entropy
decrease with nCRF stimulation. However, the noise entropy falls faster
than the total entropy.

Figure 11.
Stimulation of the nCRF decreases both
total response entropy and noise entropy. A-D, Total
response entropy versus stimulus size. Significant increases in total
response entropy occur in 30, 16, and 14%, and significant decreases
occur in 50, 57, and 71% of the neurons for stimulus sizes of 2 × CRF, 3 × CRF, and 4 × CRF, respectively.
E-H, Noise entropy versus stimulus size. Significant
increases in noise entropy occur in 18, 14, and 8% and significant
decreases occur in 55, 65, and 80% of the neurons for stimulus sizes
of 2 × CRF, 3 × CRF and 4 × CRF, respectively. Larger
stimuli tend to reduce both the total response entropy and the noise
entropy of individual neurons.
The differential effect of nCRF stimulation on these two entropies
underlies the observed increases in information rate, information per
spike, and efficiency. The simultaneous decrease of both total and
noise entropies explains why nCRF stimulation has a relatively weak
effect on information per second: such stimulation decreases both total
entropy and noise entropy and dilutes the effective increase in overall
information transmission rates.
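The example neuron of Figure 4 gives a concrete sense of this differential effect. The sketch below backs out the entropies implied by the caption values, assuming that efficiency is the information rate divided by the total response entropy H(r), that both are expressed in bits per second, and that information is H(r) minus H(r|s). The function name is hypothetical, and because the caption values are rounded (efficiency to the nearest percent), the results are only approximate.

```python
def implied_entropies(info_rate, efficiency):
    """Back out (H(r), H(r|s)) in bits/sec from a rate and an efficiency.

    Assumes efficiency = I / H(r) and I = H(r) - H(r|s).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must lie in (0, 1]")
    total = info_rate / efficiency
    noise = total - info_rate
    return total, noise

# Rounded values from the Figure 4 caption
crf_total, crf_noise = implied_entropies(13.1, 0.10)     # about 131.0 and 117.9
ncrf_total, ncrf_noise = implied_entropies(28.4, 0.26)   # about 109.2 and 80.8

total_drop = crf_total - ncrf_total   # about 21.8 bits/sec
noise_drop = crf_noise - ncrf_noise   # about 37.1 bits/sec
```

Under these assumptions both entropies fall with nCRF stimulation, and the noise entropy falls by more than the total entropy. The difference between the two drops equals the gain in information rate for this neuron (28.4 minus 13.1 bits/sec).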
Information per spike and efficiency are both ratio measures with the
weakly increasing information rate in their numerators. However, the
denominator terms of both measures (µ and H(r),
respectively) shrink with increasing stimulus size. This convergence of
a weakly increasing numerator and a decreasing denominator underlies
the strong increases in information per spike and efficiency as a function of stimulus size. The nCRF appears to suppress most responses and enhance a select few. As sparseness increases, those action potentials that are not reliably linked to stimulus properties are
winnowed from the responses of the neuron.
 |
DISCUSSION |
Our results show that nCRF stimulation changes the response
entropies of V1 neurons. Stimulation of the nCRF decreases total response entropy but has an even greater effect on the noise entropy. This differential modulation underlies the pattern of results we
observed: relative to CRF stimulation alone, naturalistic nCRF stimulation increases selectivity, information per second, information per spike, and efficiency.
Previous theoretical research has shown that the informative components
of natural scenes are sparsely distributed (Field, 1987; Olshausen and Field, 1996, 1997; Bell and Sejnowski, 1997). Our
results suggest that the nCRF might tune V1 neurons to match the
sparsely distributed, informative components of natural scenes. The
resulting neural code is also sparse, highly selective, and efficient.
The level of sparseness in the neural code does not necessarily match
that of natural images; the sparse components of natural images are determined by the physical structure of the world, while the
sparseness of a neural code also reflects biophysical, computational,
and behavioral constraints (van Hateren and Ruderman, 1998). Therefore, the neural code might be more or less sparse than would be expected based simply on the statistics of natural images. Future res