The Journal of Neuroscience, November 15, 2002, 22(22):9945-9960
Processing of Natural Temporal Stimuli by Macaque Retinal
Ganglion Cells
J. H. van Hateren1, L. Rüttiger2,3, H. Sun4, and B. B. Lee2,4
1 Department of Neurobiophysics, University of
Groningen, 9747 AG Groningen, The Netherlands, 2 Department
of Neurobiology, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany, 3 Tübingen Hearing Research
Center, University Clinics, 72076 Tübingen, Germany, and
4 State University of New York College of Optometry, New
York, New York 10036
ABSTRACT
This study quantifies the performance of primate retinal ganglion
cells in response to natural stimuli. Stimuli were confined to the
temporal and chromatic domains and were derived from two contrasting
environments, one typically northern European and the other a flower
show. The performance of the cells was evaluated by investigating
variability of cell responses to repeated stimulus presentations and by
comparing measured to model responses. Both analyses yielded a quantity
called the coherence rate (in bits per second), which is related to the
information rate. Magnocellular (MC) cells yielded coherence rates of
up to 100 bits/sec, rates of parvocellular (PC) cells were much lower,
and short wavelength (S)-cone-driven ganglion cells yielded
intermediate rates. The modeling approach showed that for MC cells,
coherence rates were generated almost exclusively by the luminance
content of the stimulus. Coherence rates of PC cells were also
dominated by achromatic content. This is a consequence of the stimulus
structure; luminance varied much more in the natural environment than
chromaticity. Only approximately one-sixth of the coherence rate of the
PC cells derived from chromatic content, and it was dominated by
frequencies below 10 Hz. S-cone-driven ganglion cells also yielded
coherence rates dominated by low frequencies. Below 2-3 Hz, PC cell
signals contained more power than those of MC cells. Response variation between individual ganglion cells of a particular class was analyzed by
constructing generic cells, the properties of which may be relevant for
performance higher in the visual system. The approach used here helps
define retinal modules useful for studies of higher visual processing
of natural stimuli.
Key words:
retinal ganglion cells; magnocellular; parvocellular; natural stimuli; information theory; macaque
INTRODUCTION
There is growing interest in the way
the visual system processes natural stimuli. Theoretical studies have
used the statistical properties of stimuli from natural environments to
predict spatial, temporal, and chromatic properties of various stages
in visual processing (Srinivasan et al., 1982 ; Field, 1987 ; Atick,
1992 ; van Hateren, 1993 ; Dong and Atick, 1995 ; Olshausen and Field, 1997 ; van Hateren and Ruderman, 1998 ; for review, see Simoncelli and
Olshausen, 2001 ). Natural, or at least naturalistic, stimuli have been
used to physiologically investigate system function under normal
environmental conditions. Species studied have ranged from
invertebrates (Laughlin, 1981 ; van Hateren, 1992 ; Passaglia et al.,
1997 ; Kern et al., 2001 ; Lewen et al., 2001 ; van Hateren and Snippe,
2001 ) through nonmammalian vertebrates (Vu et al., 1997 ; Berry, 2000 )
to mammals (Dan et al., 1996 ; Baddeley et al., 1997 ; Stanley et al.,
1999 ; Vinje and Gallant, 2000 ). Study of primates is of particular
interest in that they are the only mammals with trichromatic vision
(Jacobs, 1993 ), and the visual capabilities of Old World primates are
close to those of humans. The macaque retina is a suitable locus for
such a study, because ganglion cell types and their receptor and
bipolar inputs are physiologically and anatomically well characterized
(Kaplan et al., 1990 ; Dacey, 2000 ), and this can aid interpretation of
responses to natural scenes.
Although our final goal is a full spatiotemporal and chromatic analysis
of ganglion cell responses to natural stimuli, we begin with a simpler
stimulus, a spatially homogeneous field modulated only in time and
spectral properties. The results are conceptually and computationally
easier to analyze than those of full spatiotemporal stimuli, because
the stimulus contains only two (time and spectrum) rather than four
dimensions (when two spatial ones are added). Furthermore, many complex
properties of the visual system, such as luminance and contrast gain
controls, are already present in the time domain. We here attempt to
capture responses to naturalistic stimuli in these dimensions, before
attempting a full spatiotemporal model.
We used two different examples of a temporal stimulus, which we call
chromatic time series of intensities (CTSIs). One was derived from a
typical northern European environment, and the other was recorded from
a flower show, which provided a different distribution of
chromaticities (see Fig. 1). Stimuli were presented while responses
were recorded from magnocellular (MC), parvocellular (PC), or short
wavelength (S) cone-driven ganglion cells. We show that linear models
do not describe responses to natural stimuli well and develop
nonlinear models that perform more satisfactorily. These models are
developed for two main purposes. First, they allow us to analyze and
quantify how information on luminance and spectral aspects of the
stimuli is distributed among the different classes of ganglion cells.
Second, they form a step toward the development of full spatiotemporal
models that could be used as preprocessing modules for studies of
higher visual processing.
MATERIALS AND METHODS
Preparation and recording. Ganglion cell activity was
recorded from the retina of the anesthetized macaque (Macaca
fascicularis). The animals were initially sedated with an
intramuscular injection of ketamine (10 mg/kg). Anesthesia was
maintained with inhaled isoflurane (0.2-2%) in a 70:30
N2O/O2 mixture. Local
anesthetic was applied to points of surgical intervention. EEG and
electrocardiogram were monitored continuously to ensure animal
health and adequate depth of anesthesia. Muscle relaxation was
maintained by a constant infusion of gallamine triethiodide (5 mg · kg⁻¹ · hr⁻¹,
i.v.) with accompanying dextrose Ringer's solution (5 ml/hr). Body
temperature was kept close to 37.5°C. End tidal
CO2 was adjusted to close to 4% by adjusting the
rate of respiration. All procedures were approved by the State of Lower
Saxony Animal Welfare Committee and the Animal Care Committee of State
University of New York College of Optometry.
A tungsten-in-glass recording microelectrode was introduced to
the retina via a scleral hole using established techniques. The details
of the preparation can be found in Lee et al. (1989) . The location of
the receptive field of each cell was mapped onto a tangent screen 114 cm from the eye. Cell identification was achieved using a battery of
tests including chromatic sensitivity and time course of responses and
other tests shown to reliably distinguish between MC and PC cells and
those with S-cone input (Lee et al., 1989 ). Eccentricity of receptive
fields ranged between 5 and 15°. The results presented in this
article are based on 42 ganglion cells recorded from six animals.
Partial measurements on another 35 cells from nine animals were fully
consistent with those reported here.
Stimuli. Measurements on retinal ganglion cells were
performed with two different naturalistic stimuli ("laboratory
environment" and "flower show") that were measured in two
alternative environments, using different measurement equipment and
different equipment to present the stimuli to the macaque retina.
The laboratory environment stimulus was recorded near the laboratory of
one of the authors (Groningen, August). This environment consisted of many shades of green and brown (bushes, a variety of
plants, grass, soil) but also contained flower beds and some manmade
materials (pavement, concrete, buildings). The environment was scanned
during walking with a hand-held optical device consisting of a lens
focused onto a pinhole in front of a light guide. The resulting angular
sensitivity of the detector had a full width at half-maximum of 8.7 arc
min. The light was split (through a dichroic mirror, a
half-silvered mirror, and spectral filters) into three chromatic
channels, each equipped with a photomultiplier (Hamamatsu H5701-50). By
combining filters (Edmund Optics), we tuned the three chromatic
channels to approximately match the spectral sensitivities of the long
(L), middle (M), and short wavelength (S) cones. A linear
transformation of the three photomultiplier outputs was then used to
improve the fidelity of the cone excitations.
During the sample period, signals from the photomultipliers were
recorded on a portable DAT-recorder (Sony PC-208A). The resulting three
signals were down-sampled and transformed to be presented on a
Maxwellian view system with three light-emitting diodes (LEDs) with dominant wavelength 460, 554, and 638 nm (Lee et al., 1990 ). LED
intensity was driven by a frequency-modulated pulse train that gave a
highly linear output. Stimuli were presented at a sample rate of 400 Hz
with 12-bit resolution. A 4.7° homogeneous stimulus field was used.
The duration of the CTSI was either 1 or 10 min. Results for 1 and 10 min presentations were very similar. The CTSI was typically repeated
six times, with each repeat preceded by a period of steady
illumination. There was generally no systematic change in responses
from the first to the last repeat, indicating that the state of cells
was stationary. Because the three LEDs of the Maxwellian view system
did not completely span the recorded color space, the stimulus had to
be modified. For cells receiving input from only the L- and M-cones (MC
and PC cells), the appropriate combinations of M- and L-cone
excitations could be achieved by modulation of all three diodes, S-cone
excitation being allowed to vary. For cells with S-cone input, the
diode outputs were adjusted to provide the appropriate (M + L) signal.
This is a physiologically reasonable procedure, because the S-cone
antagonistic L-, M-cone inputs have been shown to sum linearly (Smith
et al., 1992 ). Figure 1, A, C,
and D, shows several basic characteristics of the stimulus. In Figure 1A, a scatter diagram of the chromaticity
coordinates is shown; in Figure 1D, the distribution
of illuminances is shown; and in Figure 1C, the illuminance
power spectrum normalized by the average illuminance of the stimulus
(1179 td) is shown.

Figure 1.
Characteristics of the stimuli. A,
x-y chromaticity coordinates of the
stimulus recorded in the environment of the laboratory and played back
on the LEDs of the Maxwellian view. B,
x-y chromaticity coordinates of the
flower show, as presented on the monitor. C, Normalized
power spectra of the two stimuli. The power spectra were normalized by
dividing by the square of the average illuminance of the stimuli, 1179 td for the LEDs (laboratory environment) and 222.2 td for the display
(flower show). Frequencies are smoothed (averaged with a group of
neighboring frequencies, with the group size proportional to
frequency). D, E, Histograms of
illuminance values of the stimulus from the laboratory environment and
the flower show. F, Comparison of achromatic
(a) and chromatic contrast
(clm,
cs) in the flower show stimulus; see
Materials and Methods for details.
The flower show stimulus was recorded at the Westfriese Flora
(Bovenkarspel, The Netherlands), which is claimed to be the world's
largest indoor flower show. We recorded a movie with a digital video
camera (JVC GR-DVL9600) while walking through the exhibition. The
camera was used in progressive scan mode, at 25 frames per second
(fps). The camera was held steady, either with only unintentional
manual vibration or with deliberate manual displacements and smooth
scans. Every 2-3 sec a shift of varying angle was made toward a new
camera heading. The movie was presented to the monkey six times faster
than recorded (see below), and so there were effectively two to three
gaze shifts per second in the stimulus. This recording procedure was an
attempt to roughly mimic typical eye movements. The recorded movie was
transported to a PC and stored as separate frames in a noncompressed
format. Although the movie was intended primarily for a full
spatiotemporal analysis of ganglion cell performance (our unpublished
results), we reduced it to a temporal stimulus for the present
purpose. This was done by averaging the effective L-, M-, and S-cone
illuminances produced by the display over a circular weighting profile
shaped as a cosine in the interval −π/2 to π/2 (full diameter 15 arc min, positioned in the center of the movie). The display was driven to produce these illuminances over a field of 4.6° × 4.6°,
of which the contrast was tapered with a Kaiser-Bessel window to reduce
potential edge effects. The stimulus was viewed through a 4 mm
artificial pupil. The movie was compressed to an mpeg-1 movie at
25 fps and displayed at 150 fps on a PC with Windows 98SE by using
Microsoft Mediaplayer 6.4 controlled by a script increasing the
displayed frames per second sixfold. The PC had a dual-head display
video card (Matrox G400), with a dedicated display for stimulation
(Iiyama Vision Master Pro 410, running at a resolution of 640 × 480 at 150 Hz refresh rate). The CTSI movie had a duration of 1 min and
was typically repeated six times during a neural recording. Each repeat
was preceded by an equal energy white of the same mean illuminance as
the movie. Again, we found that there was generally no systematic
change in response from the first to the last repeat.
Synchronization with the data acquisition was provided by
synchronization pulses carried by the audio track of the movie. The
display used for stimulation was gamma corrected with a calibrated photomultiplier; spectral calibration was performed with an Ocean Optics spectrometer. Because the mpeg compression can change the illuminances somewhat, the calibrations were not performed on the
original frames but on the frames resulting from decompressing the mpeg
movies. Note that the entire calibration procedure deals only with the
stimulus as actually delivered to the macaque retina; no attempt was
made to calibrate, for example, the video camera (which uses automatic
gain control, digital compression, and spectral properties deviating
from those of the cones). Thus, the stimulus on the display is expected
to only approximate the real one at the flower show. Therefore, this
stimulus is different in this respect from the one recorded in the
environment of the laboratory and presented with the LEDs, because for
the latter we reproduced the stimulus as actually present in the
natural environment. Figure 1, B, E,
C, and F, shows characteristics of this stimulus:
a scatter diagram of chromaticity coordinates (Fig.
1B), which differed substantially between the
environments, the distribution of illuminances (Fig.
1E), and the illuminance power spectrum normalized by
the average illuminance, 222.2 td (Fig. 1C). Differences in
the distribution of illuminances between the two CTSIs are caused
partly by genuine differences between the environments and partly by
nonlinearities in the video camera and display. Figure
1F compares achromatic to chromatic contrast in the
flower show stimulus. For this calculation the L-, M-, and S-cone
illuminances (l, m, and s; see below)
were transformed similarly to the scheme of Ruderman et al. (1998), with, e.g., $\hat{l} = \log l - \langle \log l \rangle$, where the logarithm has base e, and
$\langle \cdot \rangle$ denotes averaging over the time series. The achromatic signal is then defined as $a = (\hat{l} + \hat{m})/\sqrt{2}$, and two
different chromatic signals as $c_{lm} = (\hat{l} - \hat{m})/\sqrt{2}$ and
$c_s = (2\hat{s} - (\hat{l} + \hat{m}))/\sqrt{6}$. The curves in
Figure 1F are the amplitude spectra of these signals.
Similar curves were obtained for the CTSI from the laboratory environment.
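The transform into achromatic and chromatic signals can be illustrated with a minimal Python sketch. It assumes the $\sqrt{2}$ and $\sqrt{6}$ normalizations of the Ruderman et al. (1998) scheme; the function names `log_contrast` and `opponent_signals` are hypothetical and are not taken from the paper.

```python
import math

def log_contrast(x):
    """Return the natural log of each sample minus the time-averaged log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def opponent_signals(l, m, s):
    """Achromatic (a) and chromatic (c_lm, c_s) signals from cone illuminances."""
    lh, mh, sh = log_contrast(l), log_contrast(m), log_contrast(s)
    a = [(x + y) / math.sqrt(2) for x, y in zip(lh, mh)]
    c_lm = [(x - y) / math.sqrt(2) for x, y in zip(lh, mh)]
    c_s = [(2 * z - (x + y)) / math.sqrt(6) for x, y, z in zip(lh, mh, sh)]
    return a, c_lm, c_s
```

For a pure intensity change, in which all three cone signals scale by the same factor, both chromatic signals vanish and only the achromatic signal varies.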
For both the laboratory environment and flower show stimulus, we
calculated L-, M-, and S-cone illuminance (l, m,
and s, trolands) for input to the models. The signals
l, m, and s were determined from the
Smith/Pokorny cone fundamentals (Smith and Pokorny, 1975 ), which are
defined such that illuminance is given by l + m,
whereas s is normalized with respect to an equal energy
white (Boynton and Kambe, 1980 ).
Data evaluation. All calculations in this article were
standardized to a time resolution of 1 msec. Stimuli presented at 400 and 150 Hz were interpolated to 1 kHz, and spike times recorded at 10 kHz resolution were reduced to 1 msec bins. A time resolution of 1 msec
provides a frequency bandwidth of 500 Hz.
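The standardization to 1 msec resolution can be sketched as follows. The interpolation method is not stated in the text, so linear interpolation is an assumption here, as are the function names and the representation of spike times as integer ticks of 0.1 msec.

```python
import numpy as np

def bin_spikes(spike_ticks, duration_ms, ticks_per_ms=10):
    """Count spikes per 1 msec bin from spike times given in 0.1 msec ticks."""
    bins = np.asarray(spike_ticks, dtype=np.int64) // ticks_per_ms
    bins = bins[bins < duration_ms]          # drop spikes beyond the analysis window
    return np.bincount(bins, minlength=duration_ms)

def resample_stimulus(samples, rate_hz, duration_ms):
    """Linearly interpolate a stimulus sampled at rate_hz onto a 1 kHz grid."""
    t_in = np.arange(len(samples)) * 1000.0 / rate_hz   # sample times in msec
    t_out = np.arange(duration_ms, dtype=float)          # 1 msec grid
    return np.interp(t_out, t_in, samples)
```

With 1 msec bins the Nyquist frequency is 500 Hz, which is the bandwidth quoted above.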
Expected coherence (Haag and Borst, 1998 ) and expected coherence rate
(van Hateren and Snippe, 2001 ) were computed as follows. From the
responses $\rho_i(t)$ to $m$ stimulus repeats, the average:

$$\bar{\rho}(t) = \frac{1}{m}\sum_{i=1}^{m}\rho_i(t) \tag{1}$$

is calculated. The power spectrum of $\bar{\rho}(t)$ is $S_{raw}$, a (biased) estimate of the signal power spectrum. For each response, the deviation $\rho_i(t) - \bar{\rho}(t)$ is calculated; its power spectrum is $N_i$. Then $N_{raw} = \sum_{i=1}^{m} N_i/m$ is a (biased) estimate of the noise power spectrum. Unbiased estimates of signal and noise power can be obtained (van Hateren and Snippe, 2001) as $\hat{S} = S_{raw} - N_{raw}/(m-1)$ and $\hat{N} = N_{raw}\,m/(m-1)$, which yields the signal-to-noise ratio (SNR):

$$\mathrm{SNR} = \hat{S}/\hat{N} \tag{2}$$

The expected coherence is (Haag and Borst, 1998):

$$\gamma^2_{exp} = \frac{\mathrm{SNR}}{\mathrm{SNR}+1} \tag{3}$$

and the expected coherence rate is:

$$R_{exp} = -\int_0^{f_0}\log_2\left(1-\gamma^2_{exp}\right)df \tag{4}$$

where the integral extends to a frequency $f_0$ where the coherence has become zero. Because the SNR and thus $\gamma^2_{exp}$ is unbiased through Equation 2, $\gamma^2_{exp}$ fluctuates around zero for high frequencies (see Figs. 4B, 6, 9), and $R_{exp}(f_0)$ becomes essentially flat for sufficiently high $f_0$. Thus the choice of $f_0$ is not critical, as long as it is high enough.
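The computation of expected coherence and expected coherence rate from repeated responses can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the authors' code: spectra are estimated by averaging periodograms over non-overlapping stretches of length `seg_len` (the text does not specify the spectral estimator), the DC bin is left out of the integral, and the name `expected_coherence` is hypothetical.

```python
import numpy as np

def expected_coherence(responses, seg_len=256, dt=0.001):
    """responses: array of shape (m repeats, T samples) at time step dt (sec).

    Returns (freqs, gamma2_exp, R_exp) with R_exp in bits/sec.
    """
    r = np.asarray(responses, dtype=float)
    m = r.shape[0]
    n_seg = r.shape[1] // seg_len
    # Cut every repeat into n_seg stretches: shape (m, n_seg, seg_len)
    r = r[:, :n_seg * seg_len].reshape(m, n_seg, seg_len)
    mean_r = r.mean(axis=0)                                   # average over repeats
    # Raw (biased) signal and noise power spectra, averaged over stretches
    s_raw = np.mean(np.abs(np.fft.rfft(mean_r, axis=-1)) ** 2, axis=0)
    n_raw = np.mean(np.abs(np.fft.rfft(r - mean_r, axis=-1)) ** 2, axis=(0, 1))
    # Bias correction for a finite number of repeats m
    s_hat = s_raw - n_raw / (m - 1)
    n_hat = n_raw * m / (m - 1)
    snr = s_hat / n_hat
    gamma2 = snr / (snr + 1.0)
    freqs = np.fft.rfftfreq(seg_len, d=dt)
    df = freqs[1] - freqs[0]
    # Riemann sum of -log2(1 - gamma2) over frequency, skipping the DC bin
    rate = -np.sum(np.log2(1.0 - gamma2[1:])) * df
    return freqs, gamma2, rate
```

Because the corrected SNR fluctuates around zero where there is no signal, those frequency bins add positive and negative contributions that largely cancel, which is why the upper integration limit is not critical.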
Models were evaluated by calculating the coherence $\gamma^2$ between model response (i.e., the transformed stimulus, $s_{mod}$, calculated at a resolution of 1 msec) and measured response (see Fig. 5), with:

$$\gamma^2(\omega) = \frac{\left|\langle r\, s_{mod}^{*}\rangle\right|^2}{\langle s_{mod}\, s_{mod}^{*}\rangle\,\langle r\, r^{*}\rangle} \tag{5}$$

where the brackets denote ensemble averaging over the spectra $s_{mod}$ of different time stretches of the model response and the spectra $r$ of the corresponding response stretches; * denotes the complex conjugate, and $\omega$ is the angular frequency. The numerator is the power of the cross-spectrum of model response and measured response; the denominator is the product of their power spectra. If the number of different time stretches $n$ is not large, $\gamma^2$ is biased, which can be corrected by assuming that $r$ can be written as $r(\omega) = p(\omega) + \nu(\omega)$, with $\nu$ independent noise. Then the coherence calculated for $n$ stretches of $r$ and $s$ yields:
(6)
whereas:
(7)
Note that the coherence between r and
smod (Eq. 5) is the same as the
coherence between r and r' (see Fig. 5). This can
be easily seen by writing r' = W · smod, with W the
transfer function of the Wiener filter. W will then cancel
from the numerator and denominator of the coherence of r and
r', which then reduces to Equation 5.
The coherence rate $R_{coh}$ for $\gamma^2$ is defined as:

$$R_{coh} = -\int_0^{f_0}\log_2\left(1-\gamma^2\right)df \tag{8}$$
The coherence rates defined in Equations 4 and 8 are formal
definitions, which are valid for any coherence regardless of whether
the system is linear and whether the signals are Gaussian and
independent. The coherence rate quantifies, with a single number, how
close the coherence function is to 1 over the entire frequency axis.
For the interpretation of the coherence rate, however, it is important
to note that the coherence itself addresses only the linear
relationship between two signals. For a further discussion of the
formal use of the coherence rate and its relation to the information
rate, see van Hateren and Snippe (2001) .
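The coherence between a model response and a measured response, and the corresponding coherence rate, can be sketched as below. This is a hedged illustration: the averaging runs over non-overlapping stretches of length `seg_len`, the finite-stretch bias correction mentioned above is not applied, the DC bin is left out of the integral, and the name `model_coherence` is hypothetical.

```python
import numpy as np

def model_coherence(s_mod, r, seg_len=256, dt=0.001):
    """Coherence between model response s_mod and measured response r.

    Returns (freqs, gamma2, R_coh) with R_coh in bits/sec. No bias correction.
    """
    s = np.asarray(s_mod, dtype=float)
    r = np.asarray(r, dtype=float)
    n = min(len(s), len(r)) // seg_len
    # Spectra of corresponding time stretches, shape (n, seg_len // 2 + 1)
    S = np.fft.rfft(s[:n * seg_len].reshape(n, seg_len), axis=1)
    R = np.fft.rfft(r[:n * seg_len].reshape(n, seg_len), axis=1)
    cross = np.mean(R * np.conj(S), axis=0)        # ensemble-averaged cross-spectrum
    p_s = np.mean(np.abs(S) ** 2, axis=0)          # power spectrum of model response
    p_r = np.mean(np.abs(R) ** 2, axis=0)          # power spectrum of measured response
    gamma2 = np.abs(cross) ** 2 / (p_s * p_r)
    freqs = np.fft.rfftfreq(seg_len, d=dt)
    df = freqs[1] - freqs[0]
    rate = -np.sum(np.log2(1.0 - gamma2[1:])) * df
    return freqs, gamma2, rate
```

With few stretches the uncorrected estimate stays above zero even for unrelated signals, which is the bias the text refers to; the rate is also undefined if the coherence equals exactly 1 at some frequency.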
Parameters of a particular nonlinear model were varied (using a simplex
optimization algorithm) (Press et al., 1992 ) to maximize Rcoh. The form of the models was
varied, essentially by selecting and tuning individual elements, to
bring Rcoh as close as possible to
Rexp. Coherence functions and
responses were generally calculated for the same full stretch of data
as used for fitting the parameters of each model. As a control against
overfitting, we also calculated coherence functions and
responses for different parts of the stimulus, or different
repeats, than those used for the fitting procedure and found the
results to be virtually identical.
The response r' (see Fig. 5) follows from:

$$r' = \frac{\langle r\, s_{mod}^{*}\rangle}{\langle s_{mod}\, s_{mod}^{*}\rangle}\, s_{mod} \tag{9}$$
where the quotient is the filter minimizing the (rms) error
between r and r' (Theunissen et al., 1996 ). This
filter will be designated as "Wiener filter" below (Papoulis,
1977 ). It is the cross-spectrum of measured response and model response
normalized by the power spectrum of the model response. Because the
measured response contained much power at high frequencies (spikes are temporally sharp), the cross-spectrum also extended to high
frequencies. For $\gamma^2$ this was
automatically compensated by the power spectrum $\langle r\, r^{*}\rangle$, which also extended to high frequencies. This resulted in coherence functions (see Figs. 4, 6, and 9) that have low-pass characteristics, without the application of additional low-pass filtering.
However, this high-frequency compensation did not work for
r' as in Equation 9, because the denominator with the power
spectrum of the model response was in fact small for high frequencies
(as is the stimulus from which the model response derives). To exclude
the possibility that the constructed response r' (see Figs.
2, 3, and 8) was dominated by high-frequency noise, it was necessary to
low-pass filter the response r. This was done by a cascade
of eight first-order low-pass filters, each with a time constant
$\tau$ = 2 msec (for MC cells) or $\tau$ = 4 msec (for PC cells
and S-cone cells); the resulting filters have impulse responses with
full widths at half-maximum of 12.5 and 25 msec, respectively,
corresponding to cutoff frequencies (at 50% of the maximum amplitude)
of 34 and 17 Hz. For the model development, low-pass filtering was
immaterial, because parameter values and coherence functions were
virtually identical with or without this filtering. Coherence functions
and coherence rates presented in this article were calculated without
low-pass filtering. Furthermore, for interpreting the constructed
responses r' as in Figures 2, 3, and 8, the filtering was
not critical, at least within the present framework of analysis,
because the low-pass filter essentially filters away only those
frequencies where the coherence is close to zero.
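The quoted cutoff frequencies can be checked with a short calculation. A cascade of n identical first-order low-pass filters with time constant τ has amplitude response (1 + (2πfτ)²)^(−n/2); setting this equal to 0.5 and solving for f gives the cutoff. The helper name `cascade_cutoff_hz` is hypothetical.

```python
import math

def cascade_cutoff_hz(tau_s, n_stages=8, level=0.5):
    """Frequency (Hz) at which the amplitude of n cascaded first-order
    low-pass filters, each with time constant tau_s (sec), falls to `level`."""
    # Solve (1 + x**2) ** (-n_stages / 2) == level for x = 2*pi*f*tau
    x = math.sqrt(level ** (-2.0 / n_stages) - 1.0)
    return x / (2.0 * math.pi * tau_s)

mc_cutoff = cascade_cutoff_hz(0.002)   # tau = 2 msec
pc_cutoff = cascade_cutoff_hz(0.004)   # tau = 4 msec
```

This evaluates to roughly 34.6 Hz and 17.3 Hz, in line with the 34 and 17 Hz given in the text.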
Information rates can be obtained from normalized spike rates (Brenner et al., 2000); see Equation 12. The spike rate can be calculated as the average response $\bar{\rho}(t)$ (Eq. 1), but for small numbers of repeats this will be noisy. Let us assume that, in the frequency domain, the response can be written as $r(\omega) = p(\omega) + \nu(\omega)$, with $r$ the Fourier transform of $\rho_i(t)$, $p$ the Fourier transform of the underlying spike rate that we want to estimate, and $\nu$ independent noise. The Wiener estimate of $p$ based on the average of $m$ repeats, $\bar{r}$, is then $\hat{p} = (S_{\bar{r}p}/S_{\bar{r}\bar{r}})\,\bar{r}$. The expectation value of the cross-spectrum is $S_{\bar{r}p} = |p|^2$, and that of the power spectrum is $S_{\bar{r}\bar{r}} = |p|^2 + |\nu|^2/m$. Therefore, an estimate of $p$ is obtained as:

$$\hat{p} = \frac{\mathrm{SNR}}{\mathrm{SNR} + 1/m}\,\bar{r} \tag{10}$$

with the SNR given by Equation 2. Transforming $\hat{p}$ to the time domain then gives an estimate of the spike rate, as used in Equation 12. The factor multiplying $\bar{r}$ in Equation 10 is a low-pass filter. It was smoothed by block averaging with a width proportional to the frequency to prevent fluctuations of the filter at high frequencies affecting the estimate of Equation 12.
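The spike rate estimate can be sketched as a frequency-domain gain applied to the repeat-averaged response. The gain SNR/(SNR + 1/m) used here follows from the expectation values of the cross-spectrum and power spectrum given above. Two simplifications are assumptions of this sketch and not part of the described procedure: negative SNR estimates are clipped to zero, and the block averaging of the filter is omitted. The name `wiener_rate_estimate` is hypothetical.

```python
import numpy as np

def wiener_rate_estimate(mean_response, snr, m):
    """Filter the repeat-averaged response with gain SNR / (SNR + 1/m).

    mean_response: average over m repeats, length T.
    snr: SNR per frequency bin, length T // 2 + 1 (rfft layout).
    """
    r_bar = np.fft.rfft(mean_response)
    # Assumption: clip negative (unbiased) SNR estimates so the gain stays in [0, 1)
    snr = np.clip(np.asarray(snr, dtype=float), 0.0, None)
    gain = snr / (snr + 1.0 / m)
    return np.fft.irfft(gain * r_bar, n=len(mean_response))
```

Where the SNR is high the average response passes almost unchanged, and where it is near zero the response is suppressed, which gives the filter its low-pass character.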
RESULTS
Below we give examples of responses of macaque retinal ganglion
cells to a CTSI. Next we describe the expected coherence and coherence
rate of individual cells, based on repeated stimulus presentations. For
the various classes of retinal ganglion cells we then develop models
that produce a coherence rate as close as possible to that inferred
from response repeatability. Finally, we introduce the concept of a
generic cell and proceed to analyze how the retinal cells distribute
among themselves information on luminance and chromatic aspects of the stimulus.
Examples of responses
Figure 2 shows responses of an
on-center MC cell to a 3 sec stimulus segment from the laboratory
environment CTSI. Each response is shown as a spike train (short
vertical bars) and, for presentational purposes, as a filtered
version that gives an estimate of local spike rate (see Materials and
Methods). The average of these local spike rates is also shown. It can
be seen, both from the local rates and from the spike trains, that
responses are similar but not identical. The traces marked
m1-m3 are
model calculations that will be discussed in a later section.

Figure 2.
Examples of responses of an on-center MC cell and
model responses. The top eight rows of spike rates show
eight different responses of the same cell to the stimulus shown at the
top (3 sec of a 10 min stimulus; laboratory
environment). The short vertical bars show the timing of
individual spikes; local spike rate was estimated here by filtering
this spike train with a low-pass filter with a full width at
half-maximum of 12.5 msec. Bottom rows show the average
of the eight local spike rate traces, the response of a linear model
(m1), the response of a model with a
bandpass filter, a compressive nonlinearity, and a rectification
(m2), and the response of the model
shown in Figure 7A
(m3). Parameters for model
m3 (see the legend of
Fig. 7A) were 1 = 6.9 msec,
2 = 60 msec, k1 = 1.2 · 10 2
td 1/2, + = 10 msec,
c1 = 9.5 · 10 3,
q0 = 0.57, q1 = 8.7, c2 = 1.8 ·
10 4, 3 = 208 msec, and k2 = 1.1.
Examples of responses of a +L-M PC cell and a +S-ML cell are
given in Figure 3. The top two
panels show the illuminance of the stimulus and two measures of
its spectral properties. The four rows marked +L-M give
spike trains and local spike rates of the on-center PC cell. The cell
responds clearly to increases in the l − m
difference signal. The stretch of stimulus shown was selected to
include several such increases, but for the entire time series they
were relatively rare. For much of the time, this cell responded mainly
to changes in luminance.

Figure 3.
Examples of responses of a PC and
small-bistratified cell, and model responses. The top
panel shows the stimulus (overlapping with that of Fig. 2);
(l-m)/(l+m) shows the normalized
difference of L- and M-cone excitation;
(s-0.5(l+m))/(l+m) shows the difference
of S-cone excitation and the excitations of the other cones. The
top four rows of spike rates show responses of a PC cell
(+L-M on-center) to this stimulus, with spike trains as in Figure 2;
the local spike rate was estimated from the spike trains using a
low-pass filter with a full width at half-maximum of 25 msec. The
next row shows the average of these four responses. The
row marked +L-M/m3
shows the response of the model in Figure
7B, with parameters (see the legend of Fig.
7B) 1 = 6.5 msec,
2 = 82 msec, k1 = 1.7 · 10 2
td 1/2, q = 0.07, = 1.3, k2 = 32, and
o1 = 0.45. The four rows
marked +S-ML show responses of a small-bistratified
cell, the next row shows their average, and the
bottom row (+S-ML/m) shows the response
of the model in Figure 7C, with parameters (see the
legend of Fig. 7C) 1 = 8.0 msec,
2 = 60 msec, k1 = 1.2 · 10 2
td 1/2, q1 = 0.5, q2 = 0.5, = 0.5, = 0.5 , 3 = 200 msec,
g1 = 1.28, g2 = 17, o1 = 1.5,
o2 = 0.11, = 1.3, k2 = 6.0.
The traces marked +S-ML in Figure 3 give responses of an
S-cone excitatory cell. This cell responded well to increases of s relative to l + m and was suppressed
when the stimulus shifts to longer wavelengths. The stimulus segment
shown was again selected to include large fluctuations of S-cone
excitation; when this was low, these cells fired at low rates. They
also responded to luminance changes, but in general less vigorously
than PC cells.
Coherence and models of individual cells
Expected coherence
From spike trains as in Figures 2 and 3, it was possible to
quantify the repeatability of responses, to obtain a measure of the
relation of signal to noise, and then to derive the capacity of each
neuron to transmit information. Figure
4A shows the analysis procedure, which was based on the method of Haag and Borst (1998) for
graded potential neurons [see also Borst and Theunissen (1999) and van
Hateren and Snippe (2001) ]. The averaged response is an estimate of
the "signal," from which the signal power spectrum is calculated.
The averaged response is subtracted from each individual response to
give a residual that can be considered as "noise." Averaging the
power spectra of these residuals gives an estimate of the noise power
spectrum. The SNR is the ratio of signal power spectrum to noise power
spectrum. For small numbers of repeats it will be biased because the
estimated signal power spectrum will contain some noise power, and the
estimated noise power spectrum will contain some signal power. This can
be corrected by a bias factor (see Materials and Methods).

Figure 4.
Computation and examples of expected coherence.
A, The expected coherence is calculated from the SNR
estimated from the responses to stimulus repeats. From the average
response the signal power spectrum is calculated; the difference
between each response and the average yields noise power spectra, which
are subsequently averaged. The SNR is the ratio of the signal and noise
power spectra. B, Examples of expected coherence
functions of two on-center MC cells (one for the stimulus obtained in
the environment of the laboratory, and one for the stimulus recorded at
the flower show), and similarly for two PC cells (+L-M on-center
cells) and two small-bistratified cells (+S-ML
cells). Expected coherence rates corresponding to these
coherence functions are 55 and 114 bits/sec for the MC cells, 12 and 39 bits/sec for +L-M, and 32 and 55 bits/sec for +S-ML.
A measure of response repeatability, the expected coherence
$\gamma^2_{exp}$, follows from $\gamma^2_{exp}$ = SNR/(SNR + 1), assuming noise is additive (Haag and Borst, 1998). Thus
$\gamma^2_{exp}$ approaches 1 when the SNR approaches infinity,
$\gamma^2_{exp}$ = 0.5 when SNR = 1, and
$\gamma^2_{exp}$ = 0 when SNR = 0. A useful quantity that
sums the behavior of $\gamma^2_{exp}$ over the frequency domain
is the expected coherence rate
$R_{exp}$ (Eq. 4). This is identical,
through the equation relating $\gamma^2_{exp}$ and SNR, to
Shannon's equation for the information rate in a channel with Gaussian
signals and noise, $R_{inf} = \int \log_2(1 + \mathrm{SNR})\,df$.
Rexp is therefore expressed in bits/sec. Here neither signals nor noise is Gaussian, thus
Rexp cannot be expected to give an
unbiased estimate of the information rate (see Information rates,
below). To stress this qualification, we use the term "coherence
rate" rather than "information rate" for
Rexp and related quantities.
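The equivalence between the coherence rate and Shannon's expression can be written out in one step. The derivation below uses only the relation between expected coherence and SNR and the definition of the coherence rate as the negative integral of log2(1 − coherence); the symbols $\gamma^2_{\mathrm{exp}}$, $R_{\mathrm{exp}}$, and $f_0$ are the notation assumed here for the expected coherence, the expected coherence rate, and the upper integration limit.

```latex
\begin{align*}
1 - \gamma^2_{\mathrm{exp}}
  &= 1 - \frac{\mathrm{SNR}}{\mathrm{SNR}+1}
   = \frac{1}{1+\mathrm{SNR}}, \\
-\log_2\!\left(1 - \gamma^2_{\mathrm{exp}}\right)
  &= \log_2\!\left(1 + \mathrm{SNR}\right), \\
R_{\mathrm{exp}}
  = -\int_0^{f_0} \log_2\!\left(1 - \gamma^2_{\mathrm{exp}}\right) df
  &= \int_0^{f_0} \log_2\!\left(1 + \mathrm{SNR}\right) df .
\end{align*}
```

For example, a frequency band in which SNR = 1 (coherence 0.5) contributes 1 bit/sec per Hz of bandwidth.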
The coherence between two signals (here between the "true,"
noise-free response and each measured response) quantifies, on a scale
of 0-1, how strongly the two signals are (linearly) related for each
frequency. If the coherence is 1 at a particular frequency, there is no
noise and the frequency components of the two signals can be linearly
predicted from one another. Noise will decrease the coherence. A
coherence of 0 means the signals are not linearly related at that frequency.
Examples of expected coherence functions are shown in Figure
4B for several cell classes and for both CTSIs. Note
that the coherence functions shown here and below have inherent
low-pass characteristics (see Materials and Methods); no explicit
low-pass filtering on the raw spike trains was used here. Coherences of MC cells (such as the on-center cell shown) were larger and extended to
higher frequencies than those of PC cells (such as the +L-M on-center cells) and the small-bistratified cells (+S-ML
cells). Coherences obtained with the flower show CTSI are
higher than those obtained with the laboratory environment CTSI. The
former are close to zero above 75 Hz, because of the limitation of the frame rate of the display (150 fps). Although the coherence of MC cells
stimulated with LEDs driven at 400 samples per second (laboratory
environment) extends to frequencies >100 Hz, it is low for frequencies
above 75 Hz. This suggests that the frame rate of the display used for
the flower show stimulus does not strongly limit the coherence rates
obtained with this stimulus. The coherence rates corresponding to the
coherence functions in Figure 4B are 55 and 114 bits/sec for the two CTSIs for the on-center MC cells, 12 and 39 bits/sec for the +L-M cells, and 32 and 55 bits/sec for the +S-ML cells.
Model development and optimization
Coherence functions and coherence rates can also be obtained
between the stimulus and the response. For Gaussian signals and noise,
the coherence rate between stimulus and response is identical to the
information rate derived from the stimulus reconstruction method
described by Bialek et al. (1991) and formulated in the frequency
domain by Theunissen et al. (1996) . The coherence is the cross-power
spectrum of the two signals normalized by their power spectra. Here we
do not reconstruct the stimulus from the response but construct the
response from the stimulus. We also extend the analysis to include
nonlinear models; Figure 5 shows the
method (van Hateren and Snippe, 2001 ). A nonlinear model transforms the
stimulus into a signal smod. The
Wiener filter is the optimal constructing filter as defined in Equation 9. Computing the coherence $\gamma^2$ and
coherence rate $R_{\mathrm{coh}} = -\int \log_2(1 - \gamma^2)\,df$ between
smod and an actually measured response
r then quantifies how well the model performs compared with
the real system (the retinal ganglion cell).
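As an illustration of the quantities just described, the sketch below estimates a coherence function as the squared cross-power spectrum normalized by the two power spectra, and integrates it into a coherence rate. It is an assumption-laden toy: the signals are synthetic Gaussian stand-ins for a model output and a measured response, the spectra are averaged over non-overlapping segments with a naive DFT, and the bias of coherence estimates from a finite number of segments is not corrected. The estimation procedure actually used is described in Materials and Methods and is not reproduced here.

```python
import cmath
import math
import random

def dft(x):
    """Naive one-sided discrete Fourier transform (adequate for short segments)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def coherence(a, b, seg_len):
    """Coherence |S_ab|^2 / (S_aa * S_bb), with spectra averaged over
    non-overlapping segments of length seg_len."""
    n_freq = seg_len // 2 + 1
    s_aa = [0.0] * n_freq
    s_bb = [0.0] * n_freq
    s_ab = [0j] * n_freq
    for start in range(0, len(a) - seg_len + 1, seg_len):
        fa = dft(a[start:start + seg_len])
        fb = dft(b[start:start + seg_len])
        for k in range(n_freq):
            s_aa[k] += abs(fa[k]) ** 2
            s_bb[k] += abs(fb[k]) ** 2
            s_ab[k] += fa[k].conjugate() * fb[k]
    # Clip at 1 to absorb floating-point rounding.
    return [min(abs(s_ab[k]) ** 2 / (s_aa[k] * s_bb[k]), 1.0)
            for k in range(n_freq)]

def coherence_rate(gamma2, df):
    """Riemann-sum approximation of -integral log2(1 - gamma^2) df (bits/sec)."""
    eps = 1e-12  # keeps log2 finite if an estimated coherence reaches 1
    return -sum(math.log2(max(1.0 - g, eps)) for g in gamma2) * df

random.seed(0)
fs = 400.0     # sampling rate in samples per second (hypothetical)
seg_len = 32   # segment length in samples (hypothetical)
n = 16 * seg_len

s_mod = [random.gauss(0.0, 1.0) for _ in range(n)]      # stand-in for the model output
r = [v + random.gauss(0.0, 0.5) for v in s_mod]         # stand-in for a noisy response
unrelated = [random.gauss(0.0, 1.0) for _ in range(n)]  # signal unrelated to s_mod

gamma2 = coherence(s_mod, r, seg_len)
r_coh = coherence_rate(gamma2, fs / seg_len)
print(f"toy coherence rate: {r_coh:.1f} bits/sec")
```

In this toy case the coherence between `s_mod` and `r` is high at all frequencies, whereas the coherence with `unrelated` stays near the small positive floor set by the number of averaged segments.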

Figure 5.
Coherence between cell response and model
response. Coherence and coherence rate are calculated between the
neuronal response and the output of a nonlinear model;
s, smod,
r, and r' are functions of
frequency.
Ideally, the model should perform as does the cell itself. The
performance of the cell itself was quantified above, namely as its
expected coherence rate, Rexp, i.e.,
the expectation value of the coherence rate between the "true"
response of the cell (i.e., without noise) and actually measured
responses. We can thus adopt the following strategy (van Hateren and
Snippe, 2001 ) for finding an adequate model. The parameters of a
particular model are varied to maximize its coherence rate
Rcoh with the responses of a
particular cell. This is compared with the expected coherence rate
Rexp of the same cell. If
Rcoh is systematically smaller than
Rexp for a particular class of
ganglion cells, the model needs to be amended. Amendments are then
made, and they are accepted if they bring
Rcoh (after maximizing again) closer to Rexp. The type of amendments needed
can often be inferred from a comparison of expected and model coherence
functions, and of the response r and the constructed
response r' (Fig. 5), but much of the model optimization is
a process of trial and error.
Figure 6 illustrates for an MC on-center
cell how increasingly complex models approach the expected coherence
function (thick line, Rexp = 55 bits/sec). Responses r' constructed with these models
are shown in Figure 2, with the same low-pass filter used to derive
local spike rates. Model m1 is a
straightforward linear model (i.e., the Wiener filter alone), and its
coherence falls far short of the expected value (Fig. 6); the
corresponding coherence rate, Rcoh,
was 8.5 bits/sec. The first problem of a linear model is that it
ignores the rectification of the signal, which is marked in MC cells.
Model m2 is an attempt to take this into
account. It consists of a low-pass filter (Fig.
7A,
LP1), a high-pass filter (as in Fig.
7A, with q fitted to a fixed value), a
compressive nonlinearity (Fig. 7A,
NL2), and a rectification. Although this model performs much better than m1 (Fig.
6), Rcoh = 31 bits/sec is still
appreciably smaller than Rexp (55 bits/sec). As trace m2 in
Figure 2 shows, the responses to large "on" transients in the
stimulus are now well accounted for, but small transients are missed.
There are two mechanisms that repair this deficiency. First, a
luminance gain control module helps to enhance response to small
luminance variations embedded in regions where the average luminance is
low. Adding the luminance gain control shown in Figure 7A
increases Rcoh to 35 bits/sec for this
cell. Second, model m2 does not
saturate at high contrasts, i.e., it lacks a contrast gain control
module. The most satisfactory model found so far, which includes a
contrast gain control module, m3, is shown
in Figure 7A and gave a
Rcoh = 42 bits/sec. Although this is
still smaller than Rexp, it accounts
for approximately three-quarters of
Rexp in this particular cell.
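The progression of models for this example cell can be summarized as the fraction of Rexp that each model captures. The short calculation below uses only the rates quoted in the text above; the dictionary labels are informal names chosen here.

```python
# Coherence rates (bits/sec) quoted in the text for the example on-center MC cell.
r_exp = 55.0
r_coh = {
    "m1 (linear)": 8.5,
    "m2": 31.0,
    "m2 + luminance gain control": 35.0,
    "m3": 42.0,
}

# Fraction of the expected coherence rate captured by each model.
fractions = {name: rate / r_exp for name, rate in r_coh.items()}

for name, frac in fractions.items():
    print(f"{name}: {100 * frac:.0f}% of Rexp")
```

For m3 the ratio is 42/55, or about 0.76, which is the "approximately three-quarters" stated above.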

Figure 6.
Expected coherence for an on-center MC cell and
the coherence calculated for three different models. Model
m3 is depicted in
Figure 7A; parameters for this cell were
$\tau_1 = 6.6$ msec, $\tau_2 = 49$ msec,
$k_1 = 1.3 \cdot 10^{-2}$ td$^{-1/2}$,
$c_1 = 8.7 \cdot 10^{-3}$,
$\tau_+ = 8.7$ msec, $q_0 = 0.51$, $q_1 = 10.4$, $c_2 = 2.7 \cdot 10^{-4}$,
$\tau_3 = 83$ msec, and $k_2 = 1.4$. The inset shows the impulse response of the
Wiener filter following the nonlinear part of the model (as in
Fig. 5); the horizontal bar shows the zero level and a
time scale of 50 msec.
Figure 7.
Models for retinal ganglion cells.
A, Model for the MC cells. Ii is the retinal illuminance; LP1 is a low-pass filter consisting of a cascade of three first-order filters, each with time constant $\tau_1$; LP2 is a first-order low-pass filter with time constant $\tau_2$; NL1 is a nonlinearity of the form output = $(2/\pi)\,\mathrm{atan}(k_1 \cdot \mathrm{input})$; LPq is a low-pass filter of a form that makes the entire feedforward loop behave as a high-pass filter with a frequency-domain slope q (Snippe et al., 2000), where q is given by the output of LP3; LP+ has time constant $\tau_+$; NL+ is output = $(1 + \mathrm{rec}_+(\mathrm{input})/c_1)^2$, with rec+ an operator that half-wave rectifies, retaining only the positive values of its input; (...) squares its half-wave rectified input, retaining positive signals for on-center MC cells and negative signals for off-center MC cells, in both cases related to positive luminance changes; NLc is output = $q_0 + q_1 \cdot \mathrm{input}/(c_2 + \mathrm{input})$; LP3 has a time constant $\tau_3$; NL2 is output = $(2/\pi)\,\mathrm{atan}(k_2 \cdot \mathrm{input})$.
B, Model for +L-M and -M+L PC cells; models for -L+M and +M-L cells are given by interchanging the IL and IM at the input. IL and IM are the effective L- and M-cone illuminances, for the CTSIs equal to l and m (see Materials and Methods); LP1, LP2, NL1, and NL2 are as defined at A; LPq is similar to LPq at A, with q not variable.
C, Model for the +S-ML cell. Is is the S-cone illuminance, for the CTSIs equal to s; LP1, LP2, and NL1 are as defined at A; LPq1 and LPq2 are similar to the LPq at A, with q1 and q2 not variable; LP3 has time constant $\tau_3$; NL2 is output = $g_1 \cdot \mathrm{atan}(g_2 \cdot \mathrm{input})$; NL3 is output = $(2/\pi)\,\mathrm{atan}(k_2 \cdot \mathrm{input})$.
It should be noted that Figure 7A shows only that part of
the MC cell model preceding the Wiener filter (as in Fig. 5). The inset in Figure 6 shows the impulse response of the Wiener
filter of this cell with model m3, with
the horizontal line in front designating the zero level, and
a time scale of 50 msec. The fact that the Wiener filter is here
essentially a simple low-pass filter suggests that the model itself
incorporates most of the required filtering (both linear and
nonlinear). For example, the biphasic impulse responses of MC cells
(Lee et al., 1994 ) are produced mainly by the high-pass filter in the model.
Models of retinal ganglion cells
We first developed models for all ganglion cell types from which
we recorded. The model for the MC cells is shown in Figure
7A (a sign change half-way into the model provides a signal inversion for off-center cells). The model derives primarily from results in the literature. It assumes that MC cells receive summed
input from L- and M-cones in a ratio of 1.6:1. The model consists of an
initial luminance gain control (Lankheet et al., 1993 ; Snippe et al.,
2000 ; Smith et al., 2001 ), followed by a compressive nonlinearity.
These may represent outer retinal mechanisms. There follows a high-pass
filter. The high-pass filter is implemented here as having a power-law
slope (with power q) of its transfer function [see Snippe
et al. (2000) for a discussion of this type of filter]. The model
required a fast and a slow contrast gain control. The fast one (the
inner loop) is a divisive feedback of positive peaks in the response,
essentially making peaks sharper and reduced in area. The nonlinearity
(NL+) is expansive, which means that large
peaks are affected more strongly than small peaks. We found that adding
a similar control on negative-going signals did not change
Rcoh, and therefore we omitted it. In
principle, this element resembles the contrast gain control mechanism
described for cat ganglion cells (Victor, 1987 ). The slow contrast gain control (the outer loop) controls, through a nonlinearity and a
low-pass filter, the slope (q) of the high-pass filter. The input module of this loop, (...) , uses only
signals related to increases in the luminance of the stimulus, i.e.,
positive signals for on-center MC cells, and only negative signals for
off-center MC cells. Note that the gain control at the front end of the
model retains some dependence on luminance in its output [it falls
short of Weber's law (Smith et al., 2001 )]. This also applies to the
other modules leading to the input of the outer control loop.
Therefore, this loop may relate to inner retinal gain controls that
modify the time course of MC cell responses as a function of luminance
(Lee et al., 1994 ). Finally, the model contains a compressive
nonlinearity and a rectification. We found that none of the modules in
Figure 7A can be omitted; all contribute significantly to
Rcoh.
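Some building blocks of the MC model can be written out directly from the definitions in the Figure 7A caption. The following is a minimal sketch under stated assumptions, not the implementation used in the study: the function names, the discrete-time form of the first-order filter, and the 1 msec time step are choices made here, and the parameter values are those listed for the example cell in Figure 6, read with negative exponents. The feedback loops and the power-law high-pass filter are not reproduced.

```python
import math

def rec_plus(x):
    """Half-wave rectification: keep only positive values."""
    return max(x, 0.0)

def nl1(x, k1):
    """Compressive nonlinearity NL1: (2/pi) * atan(k1 * input)."""
    return (2.0 / math.pi) * math.atan(k1 * x)

def nl_plus(x, c1):
    """Expansive nonlinearity NL+: (1 + rec+(input) / c1)^2."""
    return (1.0 + rec_plus(x) / c1) ** 2

def nl_c(x, q0, q1, c2):
    """Saturating nonlinearity NLc: q0 + q1 * input / (c2 + input), for input >= 0."""
    return q0 + q1 * x / (c2 + x)

def first_order_lowpass(signal, tau, dt):
    """Discrete first-order low-pass filter with zero initial state.
    The exponential update is an assumed discretization, exact for
    input that is constant within each time step."""
    alpha = 1.0 - math.exp(-dt / tau)
    y = 0.0
    out = []
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def lp1(signal, tau1, dt):
    """LP1: cascade of three first-order low-pass filters with time constant tau1."""
    for _ in range(3):
        signal = first_order_lowpass(signal, tau1, dt)
    return signal

# Parameters of the example cell (Figure 6), assuming negative exponents.
TAU1 = 6.6e-3   # sec
K1 = 1.3e-2     # td^(-1/2)
C1 = 8.7e-3
Q0, Q1, C2 = 0.51, 10.4, 2.7e-4

DT = 1.0e-3     # sec, hypothetical time step
step_response = lp1([1.0] * 200, TAU1, DT)  # response to a unit step

print(f"LP1 step response after 20 msec: {step_response[19]:.3f}")
print(f"NLc range: {nl_c(0.0, Q0, Q1, C2):.2f} to just under {Q0 + Q1:.2f}")
```

With these definitions NL+ leaves negative inputs at a gain of 1 and grows quadratically for positive inputs, which is the expansive behavior described above, and NLc maps the rectified, squared signal onto a slope q between q0 and q0 + q1.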
We expanded the MC model to contain separate luminance gain controls
for the L- and M-pathways, which were then added with a weighting
wL for the L-signal and
(1-wL) for the M-signal. This revealed
that the weighting wL varied
substantially from cell to cell (ranging from 0 to 1; mean ± SD
was 0.67 ± 0.28), as reported elsewhere (Valberg et al., 1992 ).
However, this increased Rcoh only
marginally (by ~1.5%). This shows that it is justified to treat MC
cells as luminance-driven cells, at least for naturalistic stimuli, and
that information processing by MC cells appears mostly independent of
whether they derive their main input from L- or M-cones.
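The weighting scheme can be illustrated with a small sketch. It assumes the inputs are already gain-controlled L- and M-pathway signals, and the function name is hypothetical. The conversion of the fixed 1.6:1 ratio into a weight assumes the two weights are normalized to sum to 1, as in the expanded model.

```python
def combine_cone_signals(l_signal, m_signal, w_l):
    """Weighted sum w_l * L + (1 - w_l) * M of two equally long signals."""
    if not 0.0 <= w_l <= 1.0:
        raise ValueError("w_l must lie between 0 and 1")
    return [w_l * l + (1.0 - w_l) * m for l, m in zip(l_signal, m_signal)]

# A fixed L:M ratio of 1.6:1 expressed as a normalized L weight.
w_fixed = 1.6 / (1.6 + 1.0)
print(f"L weight for a 1.6:1 ratio: {w_fixed:.3f}")

# Toy signals (arbitrary units) combined with equal weights.
print(combine_cone_signals([1.0, 2.0], [3.0, 4.0], 0.5))
```

Under this normalization the fixed ratio corresponds to an L weight of about 0.62, which lies well within one SD of the fitted mean of 0.67.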
We tried several other models or functional modules published in the
literature (Victor, 1987 ; Wilson, 1997 ), but none performed as well as
the model in Figure 7A. However, our purpose was not to
compare candidate models but to derive a relatively simple model that
captures the responses of ganglion cells to our stimuli, such that we
can use these models for analysis of visual coding by these neurons. It
should thus be considered as a descriptive approach, which does not
claim to precisely represent the underlying physiology. However, the
luminance and contrast gain control modules closely resemble
suggestions in the literature.
The model for the PC cells (Fig. 7B) contains initial
separate gain controls and compressive nonlinearities for the L- and M-cone pathways. These may again correspond to outer retinal
mechanisms. It was then necessary to provide a low-pass-filtered
luminance signal subtracting from the L- and M-cone pathways
(consistent with producing a power-law high-pass filter). After
subtraction of cone signals (i.e., the cone opponent stage), a
compressive nonlinearity, an offset, and a rectification complete the
model. Note that no further gain controls are necessary for this cell type, which is consistent with other data from the literature (Benardete et al., 1992 ; Yeh et al., 1995 ).
The model for the S-cone excitatory cell proved to be the least
successful of those developed here. One of the problems is a slow
adaptation phenomenon: after a prolonged absence of
short-wavelength components in the stimulus, the cell does not respond immediately when they reappear, but only after a variable delay. We modeled this as a variable threshold (Fig. 7C),
with a slow filter LP3. The top
pathway in Figure 7C is a +S-cone pathway. The
bottom pathway is a long-wavelength opponent pathway
(L+M).
Expected and model coherence rates of retinal ganglion cells
Expected coherence rates and model coherence rates were evaluated
for several models and all ganglion cells for which there was
sufficient data; fits were made separately for each individual cell.
The results are shown in Table 1 for both
CTSIs used. The results show that, as remarked above,
Rexp is larger for MC cells than for
PC cells, with +S-ML cells lying in between. The flower show stimulus
gives higher coherence rates than the laboratory environment (see
Discussion). As shown in Table 1 for MC on-center, MC off-center, and
+L-M on-center cells (similar results were obtained for the other cell
types), purely linear models (m1) that add cone
signals do not work well. Model m2 for the
+L-M cell is a linear opponent model; it performs better than
m1, but not as well as the full model
(m3) of Figure 7B. The best models we
found capture 60-70% of the expected coherence rate of MC cells, 80-90% for PC cells, and ~50% for +S-ML cells.
Coherence and models of generic cells
Responses of an individual neuron to the same stimulus are
variable (Figs. 2, 3). The responses of different neurons of the same
class show further variability. Figure 8
shows responses of five different on-center MC cells to the same
stimulus. There are differences that exceed the variability of the
responses of an individual neuron. Thus, for a uniform field, the
information delivered to the cortex by the array of on-center MC cells,
for instance, is slightly different for each cell of the array, even for cells of similar eccentricity (as was the case here).

Figure 8.
Examples of responses of different on-center MC
cells. The five top traces show the local spike rate of
five different cells, in response to the same section of the stimulus
from the laboratory environment as in Figure 2. Bottom
traces show average and model prediction. The dashes
above zero (0) show the zero level of the latter two. Model
parameters were $\tau_1 = 8.0$ msec, $\tau_2 = 85$ msec, $k_1 = 8.6 \cdot 10^{-3}$
td$^{-1/2}$, $c_1 = 1.3 \cdot 10^{-2}$,
$\tau_+ = 8.8$ msec, $q_0 = 0.52$, $q_1 = 12.5$, $c_2 = 3.5 \cdot 10^{-4}$,
$\tau_3 = 283$ msec, and
$k_2 = 1.1$.
There are two possibilities as to how the cortex might deal with this
variability. Either it knows (or learns) the temporal characteristics
of each individual neuron and uses all information in the signal of
each cell, or it considers the variability between neurons as a source
of (structural) noise that should be neglected. Then, it should base
its analysis on the characteristics that all neurons of a particular
class have in common. Although the first possibility was implicit in
the above attempt to develop a model that optimally described
individual neurons, we now analyze the second possibility. It leads to
the concept of a generic neuron, which represents its class of neurons,
and produces a response around which the responses of individual
neurons are distributed. We will study these generic neurons in the
simplest way possible by treating the responses coming from different
neurons (of one class) as if they were generated by a single generic
neuron. We can then use the same methods and calculate the expected
coherence rate, now of the generic neuron, and evaluate the coherence
rate of the various models describing the generic cell.
Figure 9 shows the expected coherence of
a group of responses obtained from different on-center MC cells. The
coherence between measurements and model response [(Fig.
7A, light trace) with m3 the same model as used for the on-center MC cell above] is close to
the expected coherence. The coherence rates in this example are
Rexp = 21 bits/sec and
Rcoh = 20 bits/sec. The remaining
discrepancy is at frequencies in the range 0-10 Hz, but it is small.
The inset again shows the Wiener filter following the
nonlinear model.

Figure 9.
Expected coherence and predicted coherence for the
generic on-center MC cell (obtained from the same 5 cells as in Fig.
8). Parameters are as in Figure 8. The inset shows the
impulse response of the Wiener filter, as in Figure 6.
We performed an analysis of generic neurons for all cell classes;
the results are given in Table 2. Table 2
shows that again Rexp is larger
for MC cells than for PC cells, and there are again higher coherence
rates for the flower show than for the laboratory environment.
Because of the additional intercell variability, all rates are lower
than the result for individual neurons in Table 1. Models typically
capture ~90% of the expected coherence rates.
In the above analysis we pooled all recorded neurons from a particular
class, regardless of whether they were measured in the same animal,
although in principle, intercell variability may be smaller within an
animal than between animals. We therefore compared interanimal and
intra-animal variability in coherence rates. Interanimal variability
was slightly larger than the intra-animal variability, but the
difference was small compared with the overall reduction in coherence
rate in generic cells.
Behavior of compound cells
In an abstract sense, the retina can be considered as a device
that transforms the stimulus into different representations. We wished
to analyze what these representations encode. To simplify notation, we
use the following abbreviations: Mon for the
on-center MC cell, Moff for the off-center MC
cell, Ron for the +L-M on-center PC cell,
Roff for the -L+M off-center PC cell,
Gon for the +M-L on-center PC cell,
Goff for the -M+L off-center PC cell, and
Bon for the +S-ML cell. For the generic models
developed in the previous section the notation is, e.g.,
$\hat{M}_{\mathrm{on}}$, where the circumflex indicates that we are dealing with the output of a generic model.
As a first analysis step, we combine on- and off-cells into compound
cells by subtracting measured responses (i.e., spike trains) of on- and
off-center cells belonging to a corresponding class. For example, the
Mo compound cell is defined as
$M_o = M_{\mathrm{on}} - M_{\mathrm{off}}$. Similarly, we define $R_o = R_{\mathrm{on}} - R_{\mathrm{off}}$ and
$G_o = G_{\mathrm{on}} - G_{\mathrm{off}}$. Because measurements on the -S+LM cell are
lacking, for the short-wavelength pathway we used
Bon. We can define the analogs for the generic
models. Thus $\hat{M}_o = \hat{M}_{\mathrm{on}} - \hat{M}_{\mathrm{off}}$,
and so on.
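The construction of a compound cell can be sketched as a subtraction of binned spike trains. The binning, the bin width, the spike times, and the function names below are assumptions for illustration only; the text specifies only that the measured on- and off-center responses are subtracted.

```python
def bin_spikes(spike_times_ms, t_stop_ms, bin_width_ms):
    """Count spikes in consecutive bins covering 0 to t_stop_ms."""
    n_bins = int(round(t_stop_ms / bin_width_ms))
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int(t // bin_width_ms)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def compound_response(on_counts, off_counts):
    """Compound cell response: on-center minus off-center, bin by bin."""
    return [a - b for a, b in zip(on_counts, off_counts)]

# Made-up spike times (msec) for an on-center and an off-center MC cell.
m_on = bin_spikes([12.0, 15.0, 31.0], t_stop_ms=50.0, bin_width_ms=10.0)
m_off = bin_spikes([2.0, 44.0], t_stop_ms=50.0, bin_width_ms=10.0)

m_o = compound_response(m_on, m_off)
print(m_o)
```

The compound signal is signed: bins in which the on-center cell dominates are positive and bins in which the off-center cell dominates are negative.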
By combining measurements |