Volume 17, Number 3,
Issue of February 1, 1997
pp. 1147-1167
Copyright ©1997 Society for Neuroscience
Song- and Order-Selective Neurons in the Songbird Anterior
Forebrain and their Emergence during Vocal Development
Allison J. Doupe
Keck Center for Integrative Neuroscience, Departments of Psychiatry
and Physiology, University of California, San Francisco, San Francisco,
California 94143-0444
ABSTRACT
Auditory experience is critical for vocal learning in songbirds as
in humans. Therefore, in a search for neural mechanisms for song
learning and recognition, the auditory response properties of neurons
in the anterior forebrain (AF) pathway of the songbird brain were
investigated. This pathway plays an essential but poorly understood
role during the period of song development when auditory feedback is
most crucial.
Single-unit recordings demonstrated that both the lateral magnocellular
nucleus of the anterior neostriatum (LMAN) and Area X (X) contain
auditory neurons in adult male finches. These neurons are strongly
selective for both spectral and temporal properties of song; they
respond more robustly to the bird's own song (BOS) than to songs of
conspecific individuals, and they respond less well to the BOS if it is
played in reverse. In addition, X neurons are more broadly responsive
than LMAN neurons, suggesting that responses to song become
progressively more refined along this pathway.
Both X and LMAN of young male finches early in the process of song
learning (30-45 d old) also contain song-responsive auditory neurons,
but these juvenile neurons lack the song and order selectivity present
in adult birds. The spectral and temporal selectivity of the adult AF
auditory neurons therefore arises during development in neurons that
are initially broadly song-responsive. These neurons provide one of the
clearest examples of experience-dependent acquisition of complex
stimulus selectivity. Moreover, the auditory properties of the AF
circuit suggest that one of its functions may be to mediate the
auditory learning and feedback so essential to song development.
Key words: auditory response properties; song learning; experience-dependent plasticity; temporal processing; complex selectivity; LMAN; Area X; zebra finch; song system
INTRODUCTION
Birdsong is a complex learned vocal behavior, with
similarities to human speech. Like speech learning, song acquisition
occurs early in a songbird's life and is critically dependent on
auditory experience and feedback (Konishi, 1965; Marler, 1970).
Moreover, songbirds possess brain areas specialized for vocal learning
and production (Nottebohm et al., 1976). This combination of a learned behavior and a discrete neural substrate involved in its control provides an ideal opportunity to study the neural basis of learning. Furthermore, because song is both spectrally and temporally complex, the birdsong system is well suited for examining how the brain learns
to process complex time-varying information.
Song learning consists of two characteristic phases (Fig.
1a). First, during a period of sensory
learning, young birds hear and memorize a parent tutor song or
"template" (Marler, 1970; Slater et al., 1988). Later, during
sensorimotor learning, birds sing and gradually refine their song until
it approximates the memorized song template. Young songbirds no longer
need to hear the tutor during this period of vocal practice but do
remain dependent on hearing to match their vocalizations to the
template (Konishi, 1965). The strong dependence of song learning on
auditory experience and feedback demonstrates that in the songbird
brain, there must exist mechanisms for auditory learning and
recognition and for auditory feedback-guided modification of the vocal
motor pathway.
Fig. 1.
a, The time line of song learning
for zebra finches. b, Schematic of the song system. The
cross-hatched nuclei, HVc, RA, and the tracheosyringeal portion of the
hypoglossal nucleus (nXIIts) form part of the descending motor pathway
for song. The nuclei X, DLM, and LMAN, shown in solid
black, form a pathway indirectly connecting HVc to the RA and
play a special role during song learning. The primary sources of
auditory input to the song system are the field L complex
(L) and its projections to the "shelf" underlying HVc (stippled areas).
One likely location for the neural mechanisms underlying learning is
the song system, a set of brain nuclei found only in birds that learn
to sing using auditory feedback (Fig. 1b) (Nottebohm et al., 1976;
Kroodsma and Konishi, 1991). This system is often divided into two
interconnected circuits. The first of these forms the "motor"
pathway for song and consists of a chain of nuclei including HVc (the
acronym is used here as the proper name, as proposed by Fortune and
Margoliash, 1995) and the robust nucleus of the archistriatum (RA)
(Nottebohm et al., 1976; McCasland, 1987). The motor pathway must be
intact at all ages for song to be produced normally (Nottebohm et al.,
1976).
In contrast, a second circuit of brain nuclei is essential for normal
song production only during song learning and modification. This
pathway indirectly connects HVc to RA via the anterior forebrain (AF)
and consists of Area X (X), the medial portion of the dorsolateral nucleus of the thalamus (DLM), and the lateral portion of the magnocellular nucleus of the anterior neostriatum (LMAN) (Okuhata and
Saito, 1987; Bottjer et al., 1989) (Fig. 1b). Unlike
disruptions of the motor pathway, interruptions of this AF circuit do
not affect stable adult song production (Nottebohm et al., 1976;
Morrison and Nottebohm, 1993). Lesions of the AF pathway in animals
learning their vocalizations, however, cause highly abnormal song
(Bottjer et al., 1984; Nottebohm et al., 1990; Sohrabji et al., 1990;
Scharff and Nottebohm, 1991).
One possible function of the AF circuit is to provide the auditory
input essential to normal song development. By this hypothesis, this
circuit should contain sensory neurons responsive to sounds. In adult
songbirds, this pathway does contain auditory neurons, which are
selective for the bird's own song (BOS) (Doupe and Konishi, 1991),
like the neurons found in the song sensorimotor nucleus HVc
(Margoliash, 1983, 1986; Margoliash and Fortune, 1992; Margoliash et
al., 1994; Sutter and Margoliash, 1994; Lewicki and Konishi, 1995;
Lewicki and Arthur, 1996; Volman, 1996). The present work examines the
auditory selectivity of neurons in two of the AF nuclei, LMAN and X, in
anesthetized adult finches. These temporally and spectrally selective
neurons are among the most complex sensory neurons known. The same
nuclei were then investigated in juvenile birds, early in the process
of song learning, at a time when the AF circuit is essential for song
acquisition. The results demonstrate that this circuit is auditory in
young birds, consistent with a sensory role of this pathway in
learning. The properties of the juvenile neurons, however, are dramatically
different from those in adult birds: they are not song- or
order-selective. The complex selectivity of adult AF neurons must
therefore emerge during development, in parallel with vocal
learning.
MATERIALS AND METHODS
Experiments with adult birds (>100 d old) were conducted
primarily with male zebra finches (Taeniopygia guttata)
obtained from local breeders or raised in our colony. A small number of recordings in adult X were also obtained from male Bengalese finches (Lonchura striata); these were included in the overall
analysis, because a statistical comparison of selectivity indices (SI)
(see below) showed no differences between the two species. Juvenile male zebra finches were bred and raised in our colony in sound attenuation chambers (IAC), where they were exposed to a single tutor,
the male parent. Juvenile finches ranged in age from 29 to 45 d
posthatch (in X, 13 birds ages 29-32 d; 14 ages 33-36 d; 17 ages
37-40 d; 16 ages 41-45 d; in LMAN, 7 birds ages 37-40 d; 10 ages
41-45 d). Young zebra finches normally begin singing at approximately
day 25-40 (Immelmann, 1969; Arnold, 1975). Whether any of the juvenile
birds studied here were actually already singing was not systematically
determined; given the ages of the oldest birds, it was clearly
possible.
Electrophysiological recordings and auditory stimuli. Before
each experiment, the BOS or the parent tutor song (TUT) was recorded on
analog tape and digitized at 20 kHz with 12 bit resolution with the aid
of either a PDP-11/40 (Digital, Boston, MA), a Masscomp 5600 (Concurrent, Westford, MA), or a Sparc IPX (Sun Microsystems, Mountain
View, CA) computer (with software written by Daniel Margoliash, Larry
Proctor, and Michael Lewicki, California Institute of Technology). The
song was then stored on computer disk, along with a library of songs of
other zebra finch individuals, to be used for playback during the
physiology experiments. Songs were also reversed and edited on the
computer.
At least 1 d before the experiment, birds were anesthetized with
Equithesin (2-4 ml/kg, i.m.; 0.85 gm of chloral hydrate, 0.21 gm of
pentobarbital, 0.42 gm of MgSO4, 2.2 ml of 100% ethanol, 8.6 ml of propylene glycol to total volume of 20 ml with
H2O; chemicals from Sigma, St. Louis, MO) and placed in a
stereotaxic head holder. A stainless steel post was then cemented to
the skull in a fixed location centered on the midsagittal sinus. This
stereotaxic post served to immobilize the head during the recording
sessions and to provide a fixed point from which to measure the
location of various song nuclei. On the day of the experiment, birds
were anesthetized with 20% urethane (Sigma, 65-90 µl, i.m.,
delivered as 3-4 injections of 10-25 µl each, at 30 min intervals).
Unlike barbiturates and ketamine (Vicario and Yohay, 1993), urethane does not appear to affect the stimulus selectivity of high-order neurons. Glass-coated platinum-iridium microelectrodes were used to
make stable single-unit extracellular recordings of neuronal responses
to a variety of acoustic stimuli played back from the computer. These
stimuli were presented by a small calibrated speaker (JBL, Northridge,
CA) 1.7 m in front of the bird inside a sound attenuation chamber
lined with foam; the spectrum of the sound presentation system,
measured with a calibrated microphone placed at the location of the
bird, was flat ± 6 dB from 500 Hz to 10 kHz. The sound stimuli,
the peak amplitude of which was 70 dB sound pressure level (SPL),
included broad-band noise bursts, tone bursts from 500 Hz to 6 kHz,
usually presented in 1 kHz increments, the BOS and/or TUT (including in
reversed and edited versions), and the songs of other zebra or
Bengalese finches [conspecific (CON) songs]. Songs were
approximately matched for overall intensity as well as peak amplitude;
in a small number of cases, neurons were tested for sensitivity to
sound intensity by varying the intensity of the BOS over a range of
5-10 dB SPL and showed no strong intensity dependence in that range.
Estrildid finch song usually contains 4 to 10 syllables, defined as
continuous acoustical signals separated from surrounding syllables by a
fall in the amplitude to near zero and often by a silent interval;
syllables are composed of one or more notes (continuous signal without
abrupt frequency transitions). Strings of syllables are delivered in a
fixed sequence known as a "motif" or "phrase"; a "strophe"
or "bout" of adult finch song consists of a series of introductory notes followed by one or more repeats of the motif (Sossinka and Boehner, 1980). In some experiments, individual syllables or series of
syllables were also played to the bird to investigate the basis of a
song response. Search stimuli always included the BOS or TUT and, in
most cases, a simple stimulus such as a broad-band noise burst. Stimuli
were played with an interstimulus interval of 8-10 sec to prevent
habituation, and collections consisted of 10-25 trials; in the later
experiments (including all the recordings from juvenile birds and 30%
of the adult units), a minimum of three different stimuli were
delivered interleaved. In these experiments, the interstimulus interval
was also randomly varied from 8-10 sec on each trial to prevent any
possible entrainment of neural activity.
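The interleaved presentation with a jittered interstimulus interval described above can be sketched as follows. This is only an illustration, not the original acquisition software: the function name `build_schedule`, the round-robin ordering of stimuli, and the uniform distribution of the 8-10 sec interval are all assumptions.

```python
import random

def build_schedule(stimuli, n_trials, isi_range=(8.0, 10.0), seed=None):
    """Return a list of (stimulus_name, isi_seconds) pairs.

    Stimuli are interleaved round-robin (an assumption; the text only
    says "interleaved"), and each interstimulus interval is drawn
    uniformly from isi_range (also an assumption).
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_trials):
        for name in stimuli:
            schedule.append((name, rng.uniform(*isi_range)))
    return schedule

# Hypothetical usage: three stimuli, 10 trials each.
schedule = build_schedule(["BOS", "REV", "CON"], n_trials=10, seed=0)
```

Jittering the interval on every trial means that no stimulus arrives at a fixed period, which is the stated reason for the procedure.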
Neural activity was amplified and filtered from 300 Hz to 10 kHz, and
single units were isolated by passage through either a level detector
or a window discriminator (World Precision Instruments, Sarasota, FL);
spike event times were stored in the computer, and single-unit activity
was displayed by the PDP-11/40, Masscomp 5600, or Sparc IPX computers
both as a raster pattern and as a summed peristimulus time histogram of
10 to 25 stimulus presentations.
The percentage of neurons in LMAN and X that were responsive to sound
stimuli was not systematically quantified, although the majority of
neurons that could be isolated appeared to be auditory. The auditory
responsiveness of X and especially LMAN neurons was very sensitive to
the depth of anesthesia and the state of arousal of the animal,
however, so that in some birds (not included here), no auditory
responses of any kind could be observed throughout the experiment, and
in others, auditory responses would disappear after a period of
recording. This was true both when spontaneous rates were unusually
high and when they were very low. This property of AF neurons, along with the
difficulty in isolating units, and the long time required to
characterize each unit account for the small numbers of single units
(1-11) analyzed per bird.
Electrolytic lesions were placed at the sites of selected units. At the
end of an experiment, animals were given a lethal dose of Equithesin
and fixed with 4% formaldehyde in 0.025 M PBS via
intracardial perfusion. Electrode tracks and electrolytic lesions were
located on 30 µm frozen sections stained with cresyl violet. The
borders of X and of magnocellular LMAN (the core) were clearly
identifiable in these sections. Neuronal data were only included in the
data analysis if the recording site could be unambiguously localized
histologically.
Data analysis. Data were analyzed off-line using software
written by Michael Lewicki and Larry Proctor, California Institute of
Technology, and by Frederic Theunissen and Jim Wright, UCSF. The firing
rate to a given song stimulus was quantified from recorded data as the
average spike rate during the song (spikes/sec) during a window that
was of equal duration to the stimulus but that was delayed relative to
stimulus onset by an amount equal to the unit's response latency. The
minimum response latency (accurate at best to within 5 msec) was
determined by visual inspection of the spike rasters and histograms
when a response to a simple stimulus made it possible. If no latency
could be determined because the unit did not respond to any simple
stimuli, the latency of other LMAN or X units from the same bird was
used. If no units in a bird allowed latency determinations, default
latencies characteristic of that nucleus were used (in adult birds, 35 msec for X and 50 msec for LMAN; in juvenile birds, 70 msec for X and
100 msec for LMAN). Neural responses to song generally began long after
song onset and ended before song offset, so that the specific response latency chosen had little effect on the calculated firing rate. Units
were only considered auditory and included in the data analysis if a
paired t test showed that the spike rate during a stimulus was significantly different (p < 0.05) from the
baseline firing rate, which was determined from 2-4 sec of spontaneous
firing before and after each stimulus. The firing during the time
period (1 sec) immediately after the stimulus was always excluded from the calculation of the spontaneous rate, because many stimuli elicited
significant poststimulus inhibition of the baseline firing.
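The windowing rules in this paragraph can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis software named above: the function names are hypothetical, spike times are assumed to be in seconds, and the baseline is taken as 2 sec before the stimulus plus 2 sec after it (the text gives a range of 2-4 sec), skipping the 1 sec immediately after stimulus offset.

```python
import math

def rate_in_window(spike_times, start, duration):
    """Spike rate (spikes/sec) in the half-open window [start, start + duration)."""
    n = sum(1 for t in spike_times if start <= t < start + duration)
    return n / duration

def evoked_rate(spike_times, stim_onset, stim_duration, latency):
    """Rate in a window as long as the stimulus, delayed by the response latency."""
    return rate_in_window(spike_times, stim_onset + latency, stim_duration)

def baseline_rate(spike_times, stim_onset, stim_duration,
                  pre=2.0, post=2.0, exclude=1.0):
    """Spontaneous rate before and after the stimulus.

    The first `exclude` seconds after stimulus offset are skipped because
    of poststimulus inhibition. pre/post of 2 sec are assumed values.
    """
    post_start = stim_onset + stim_duration + exclude
    n_pre = sum(1 for t in spike_times if stim_onset - pre <= t < stim_onset)
    n_post = sum(1 for t in spike_times if post_start <= t < post_start + post)
    return (n_pre + n_post) / (pre + post)

def paired_t(evoked, baseline):
    """Paired t statistic over trials; the p value lookup is left to a stats package."""
    diffs = [e - b for e, b in zip(evoked, baseline)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)
```

Each trial contributes one evoked rate and one baseline rate, and the paired test is run across trials for a given stimulus.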
To compare the evoked responses with different stimuli, a unit's
response strength (RS) to a song stimulus was calculated as the average
spike rate during the song minus the average baseline firing rate for
the same trials (both measures as described above). This gives a
measure of the total amount of firing (above spontaneous rate) elicited
during a song. Although this measure is less sensitive for neurons that
have very brief phasic responses, LMAN and X neuronal responses tend to
be sustained over several syllables and, therefore, this measure
accurately reflects the relative strength of neuronal responses to
different song stimuli. It is an underestimate of the peak firing rate
of these neurons, however, because neuronal firing, although sustained,
does not occur throughout the entire song, but the number of spikes is
still normalized to the entire song. All trials in which a particular
stimulus type was presented to a unit were used to calculate the mean
RS for that unit. For CON songs, the RS values to different individual songs were first calculated and used to determine the number of neurons
significantly responsive to any conspecific stimulus (using a paired
t test, p < 0.05) (Table 1);
data from responses of a single unit to all CON were then used to
calculate the mean RS to CON for that unit. The mean RS for a
particular stimulus for all neurons in a nucleus was calculated as the
mean of the mean RS to that stimulus of all units analyzed. Mean RS
values for different stimuli were compared using a one-way ANOVA and Scheffe tests for correction for multiple post hoc
comparisons. The significance of the difference in RS to different
stimulus classes (i.e., the selectivity) was assessed for each unit
with an unpaired t test (p < 0.05).
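As a rough sketch of the RS bookkeeping described above, assuming per-trial evoked and baseline rates have already been computed. The function names and the example numbers are hypothetical, and pooling all CON trials for a unit before averaging is one reading of the text rather than a confirmed detail.

```python
from statistics import mean

def response_strength(evoked_rates, baseline_rates):
    """Mean RS for one unit and one stimulus type.

    Because evoked and baseline rates come from the same trials, the mean
    of the per-trial differences equals mean(evoked) - mean(baseline).
    """
    return mean(e - b for e, b in zip(evoked_rates, baseline_rates))

def mean_rs_to_con(trials_by_song):
    """Mean RS to CON for one unit, pooling trials from all conspecific songs.

    trials_by_song maps a song label to (evoked_rates, baseline_rates).
    """
    evoked, baseline = [], []
    for ev, base in trials_by_song.values():
        evoked.extend(ev)
        baseline.extend(base)
    return response_strength(evoked, baseline)

def population_mean_rs(unit_mean_rs):
    """Mean over units of each unit's mean RS (a mean of means)."""
    return mean(unit_mean_rs)

# Hypothetical numbers for illustration only.
rs_bos = response_strength([6.0, 4.0], [1.0, 1.0])
rs_con = mean_rs_to_con({"con1": ([2.0, 2.0], [1.0, 1.0]),
                         "con2": ([1.0, 1.0], [1.0, 1.0])})
```

A negative RS simply means the stimulus drove firing below the spontaneous rate.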
Another measure used to quantify the responsiveness of neurons for one
stimulus relative to another was the SI. The SI of a neuron for
stimulus A relative to stimulus B was defined as SI(AvsB) = mean RS_A/(mean
RS_A + mean RS_B). SI will
approach 1.0 if stimulus A is much preferred and zero if stimulus B is
preferred, and will be close to 0.5 if the responses to both stimuli
are similar. Because this index becomes highly nonlinear when RS values are negative (i.e., inhibitory), all negative RS values were set to
zero for the calculation of SI, which then has a minimum of zero and a
maximum of 1.0. This adjustment ignores the selectivity difference
between stimuli that evoke no response and those that actually inhibit
neural firing; the numbers of neurons showing significant response
inhibition to any stimulus (p < 0.05 compared with spontaneous rate) are therefore shown separately in Table 1. All
statistics were calculated with the aid of the software package
StatView4.5 (Abacus Concepts, Calabasas, CA).
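The SI definition, including the clamping of negative RS values to zero, can be written out as a short sketch. The function name is hypothetical, and returning `None` when both clamped values are zero is an assumption, because the text does not say how that case was handled.

```python
def selectivity_index(mean_rs_a, mean_rs_b):
    """SI(AvsB) = RS_A / (RS_A + RS_B), with negative RS values set to zero.

    Returns None when both clamped values are zero (undefined ratio);
    this handling is an assumption, not stated in the source.
    """
    a = max(mean_rs_a, 0.0)
    b = max(mean_rs_b, 0.0)
    if a + b == 0.0:
        return None
    return a / (a + b)

# Hypothetical values for illustration.
si_equal = selectivity_index(2.0, 2.0)        # equal responses
si_inhibited = selectivity_index(3.0, -1.0)   # stimulus B inhibits firing
```

With the clamp, a unit inhibited by stimulus B gets the same SI as a unit that simply does not respond to B, which is why inhibited units are counted separately in Table 1.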
RESULTS
Selective auditory neurons in adult LMAN
Song selectivity
I recorded a total of 64 single auditory units in the LMAN of
adult male finches (n = 18). Small clusters of units as
well as multiunit recordings showed qualitatively similar auditory responses but were not included in the quantification. The most striking feature of LMAN auditory neurons was their complex stimulus selectivity. These cells were much more responsive to complex acoustic
stimuli than to simple ones such as tone or broad-band noise bursts. In
particular, each BOS was a very effective stimulus for these neurons;
62/64 LMAN neurons responded significantly to the BOS (see Materials
and Methods for definition of significant response). Moreover, 52/53
neurons responded more strongly to the BOS than to songs of
conspecifics (other individuals of the same species, CON).
Representative examples of such song-selective units are shown in
Figures 2 and 3. The neuron in Figure 2
demonstrates the strong but irregular bursting that is characteristic
of responses to the BOS (Fig. 2a). The same neuron responded
much more weakly to a conspecific song, despite the general acoustic
similarity of the two finch songs (Fig. 2b), and was
actually inhibited below its spontaneous rate by another conspecific
song (Fig. 2c). The response of the neuron in Figure 3 was
sustained over several syllables of the BOS, a property typical of many
adult LMAN neurons (Fig. 3a). Thus, although this neuron
also showed a phasic response to some features of a conspecific song
(Fig. 3c), its overall response to the BOS was greater than
that to any of the other stimuli.
Fig. 2.
Auditory responses of a single unit in LMAN of an
adult zebra finch. a, The response to the BOS;
b, c, responses to two different conspecific songs (songs of other individual zebra finches). The strong
response to the BOS is followed by a period (~1.5 sec) of inhibition.
Note that there is considerable trial-to-trial variability in the
reproducibility and exact timing of the stimulus-evoked spikes. The
conspecific song in b elicits a weak response, whereas the song in c inhibits the spontaneous firing of the
neuron. Below each spike raster and peristimulus time histogram are
shown the sonogram (frequency vs time plot, with energy in each
frequency band indicated by the darkness of the signal) and the
oscillogram (amplitude waveform) of the song stimulus used.
Fig. 3.
A single unit in LMAN of an adult zebra finch.
a, Response to the BOS; b,
c, responses to two different conspecific songs; d, response to a 5 kHz tone burst. The conspecific song
in c elicits a phasic response 60 msec after each of two
syllables that contain primarily a loud sound in the 4.5-5 kHz
frequency range (circled). The BOS (a)
also contains a note in that range (circled), and a 5 kHz tone burst elicits a response as well
(d).
To quantify the responses of neurons to complex song stimuli, I
calculated their RS values (stimulus-evoked rate minus
spontaneous firing; see Materials and Methods) for different stimuli. The mean RS
for the whole population of LMAN neurons studied was much higher for
the BOS than for CON [3.53 spikes/sec ± 0.38 (SEM) vs 0.13 ± 0.12; n = 64 for the BOS, n = 53 for
CON; first two bars in Fig. 4a].
A one-way ANOVA revealed a significant difference in RS values among
all stimuli tested (F(3,167) = 39.88, p < 0.0001; stimuli were the BOS, the CON, and the two
types of reversed song, see below); a subsequent Scheffe test (to
correct for multiple comparisons) confirmed that the RS to the BOS was
significantly greater than the RS to CON (p < 0.0001). This song selectivity is a feature of individual neurons and
not simply a property emerging from the responses of the entire
population of neurons considered as a whole; a plot of RS to the BOS
versus RS to CON for all single neurons for which both stimuli were
tested shows that 52/53 neurons lie to the right of the line that
indicates equal response to both stimuli (Fig. 4b,
open squares); 46/50 individual units analyzed also had a
significantly greater response to the BOS than to CON (p < 0.05, unpaired t test). The
data in Figure 4b also show that song selectivity is neither
found only in neurons with high (or low) RS nor a property found only
in a subset of birds. More than 20% of LMAN neurons not only did not
respond strongly to CON but were also significantly inhibited by at
least one particular conspecific song (Table 1).
Fig. 4.
Summary data for adult LMAN neurons.
a, Histogram of mean of RS values of all LMAN neurons
recorded to the BOS, conspecific song (CON), BOS
reversed (REV), and BOS in reverse order
(RO). Error bars indicate SEM. b, For
each individual LMAN unit for which these stimuli were tested, the RS
value (RS) to the BOS is plotted versus the RS to CON
(open squares) or the RS to REV (solid
diamonds). The dashed line indicates the points
where responses to the stimuli plotted on each axis are equal.
To quantify the relative selectivity of a cell for two different
stimuli, an SI (SI(BOSvsCON)) was also
calculated for each neuron analyzed above (see Materials and Methods);
this index is 1.0 when the BOS is strongly preferred and 0.5 when
responses to the BOS and CON are equal. Table 1 shows that the mean
SI(BOSvsCON) for all LMAN neurons is close to
1.0.
In many cases, because of the complexity of zebra finch song, it is not
straightforward to assess exactly which features of song account for
the selective responses seen. Occasionally, however, as in Figure 3,
a-d, some indication of this is given by a
neuron's response to simpler acoustic stimuli. Figure 3d
shows the phasic response of this neuron to a tone burst of 5 kHz.
Close examination of the conspecific song of Figure 3c
reveals that the two phasic responses occurred after a note in which
most of the energy was in the same 5 kHz frequency range
(circled in each of the two motifs). The BOS also contains a
single note with energy in that range (circled in Fig.
3a), and part of the BOS response occurred at the same time
after that note as for the conspecific and tone stimuli. It is clear,
however, that the neuron was responding to a number of features of the
BOS, including some that occur before that tonal note, overall giving a
more sustained response to the BOS than to any of the other
stimuli.
Unlike song selectivity, however, responsiveness to simple acoustic
stimuli was not a universal feature of LMAN neurons. Of LMAN neurons
fully tested for sensitivity to frequencies in the 1-5 kHz range
(n = 27), only two-thirds responded to a tone burst of
at least one frequency (Table 1), and no units responded to all
frequencies. Although zebra finch songs are most notable for their many
harmonic stacks (combinations of harmonically related frequencies) and
complex noisy syllables, some songs also have syllables containing
predominantly one frequency (see for instance the songs in Figs. 3,
5). A pure tone response in LMAN might be explained by
the presence of such syllables in the BOS; in fact, in 89% of the
cases in which fully characterized neurons were found to respond to
tone bursts, the BOS contained a tonal syllable in the same frequency
range, and in half of those cases, the neurons responded to tone bursts
only in that frequency range. Despite the noisy quality of many zebra
finch syllables, broad-band noise bursts were very rarely effective
stimuli for adult LMAN auditory neurons (Table 1).
Fig. 5.
Temporal response properties of a single unit in
LMAN of an adult zebra finch. The response to the BOS
(a) is much stronger than the response to the same song
reversed (b), which even elicits some inhibition. Both
songs contain a syllable with significant power in the 5-5.2 kHz range
(circled). The same neuron responds to a 5 kHz tone
burst played in isolation (c), whereas a 2.5 and 5 kHz
tone combination elicits no response (d). Histograms of the response to 10 stimulus presentations are shown
above sonograms and oscillograms of the song stimulus;
in c and d, spike rasters are also shown
to demonstrate the long and scattered response latency of LMAN neurons
as well as the variability of the response.
Order selectivity
Temporal order is an important feature of many acoustic stimuli,
and birdsong, like speech, has particularly complex temporal structure.
Therefore, I tested the sensitivity of LMAN neurons to temporal
features of song. I did this in several ways, as follows. (1) I played
the BOS entirely reversed; this completely alters the temporal
structure of the song (both the sequence of syllables as well as the
temporal order within syllables) while maintaining the overall power
spectrum calculated over the whole song. (2) In some cases, I reversed
the order of the syllables while preserving the normal temporal order
within each syllable (e.g., dcba vs abcd); this reversed order (RO)
song disrupts the global order or sequence of syllables while
preserving the local order. In 41/43 single units tested, fully reversing the
song (REV) dramatically reduced the effectiveness of the BOS as an
acoustic stimulus for LMAN neurons.
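The difference between the two manipulations can be made concrete with a small sketch. It assumes a song that has already been segmented into syllables, each held as a list of samples; the segmentation, the function names, and the omission of the silent intervals between syllables are all simplifications for illustration.

```python
def reverse_song(syllables):
    """Fully reversed song: syllable sequence and within-syllable order both flip."""
    return [syl[::-1] for syl in reversed(syllables)]

def reverse_order(syllables):
    """Reverse order song: syllable sequence flips, within-syllable order is kept."""
    return [list(syl) for syl in reversed(syllables)]

# Toy "song" abc, each syllable two samples long.
song = [[1, 2], [3, 4], [5, 6]]
rev = reverse_song(song)    # global and local order reversed
ro = reverse_order(song)    # only global order reversed (cba)
```

Both outputs contain exactly the same samples as the original, so the overall power spectrum is unchanged; only the temporal arrangement differs.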
A typical example of order selectivity is shown in Figure 5. This
neuron responded well to the BOS, with clusters of spikes before and
especially after the circled syllables, which contain a loud note with
much energy near 5.2 kHz (Fig. 5a). In contrast, the
identical song played in reverse (REV) elicited no response from this
neuron and even slightly inhibited the neuron's spontaneous firing
rate (Fig. 5b). Consistent with the frequency content of this bird's song, this neuron also responded phasically to a 5 kHz
tone burst (Fig. 5c). Nonetheless, the neuron showed no
response to the 5.2 kHz frequency in REV, although the syllable
containing that frequency is little changed by the reversal. This
demonstrates clearly the temporal context dependence of these LMAN
neurons. Although part of the neuron's response to song may be
attributable to the 5.2 kHz tone, the response to the tone depends not
simply on its presence but also on the temporal context in which it
occurs; preceded by the wrong features in the reversed song, the
neuronal response is inhibited. Another aspect of the context
dependence of this neuron is shown in Figure 5c,d; although
the neuron responded robustly to the 5 kHz tone burst alone (Fig. 5c),
the addition of a 2.5 kHz tone to the 5 kHz tone burst completely
eliminated the response (Fig. 5d). This neuron thus requires
both a specific set of frequencies and a specific order.
The mean RS for the whole population of LMAN neurons studied was much
higher for the BOS than for REV (3.53 spikes/sec ± 0.38 vs
0.26 ± 0.16; n = 64 for the BOS,
n = 43 for REV; p < 0.0001, Scheffe
test for multiple comparisons) (Fig. 4a). As was true for
song selectivity, order selectivity was neither simply a property of
the population of neurons as a whole nor of individual birds. A plot of
the RS to the BOS versus RS to REV for all single units tested for both
stimuli shows that 41/43 neurons lie to the right of the line that
indicates equal response to both stimuli (Fig. 4b,
solid diamonds); 35/42 individual units analyzed also met a
statistical criterion for greater response to the BOS than to REV
(p < 0.05). In addition, many neurons were
significantly inhibited by presentation of REV (Table 1).
To assess the relative importance of local temporal structure
(within-note timing) versus global temporal structure (note sequence), I tested
some LMAN neurons with reverse order (RO) song (n = 12). This manipulation markedly reduced the responses of LMAN neurons
to the BOS, although, in many cases, not as completely as the full
reversal of song. This is seen in Figure 6,
a-d. This LMAN neuron responded well throughout
much of the first repetition of the motif of the BOS (Fig.
6a). Fully reversing the song completely eliminated the
response (Fig. 6b), whereas reversing the order of syllables
left a phasic response in each motif (Fig. 6c). Examination of this phasic response reveals that it occurred with ~60 msec latency after each onset of the circled syllable "a," which
contains a harmonic stack with downsweeping frequencies. This syllable also elicited a response when played by itself (Fig. 6d,
left), consistent with the idea that the response to this
portion of the song was less context-dependent than the rest of the
neuron's responses to the BOS. Songs in reverse order often revealed
which syllables of song elicited context-independent responses. When syllable "a" was reversed (which changes the direction of the frequency-modulated sweep), it elicited no significant response (Fig.
6d, right). Some of the response to the BOS
corresponded to the location of the circled syllable "a," but much
of it was before or after (Fig. 6a), showing (as in Fig. 5)
that LMAN neurons respond to a number of features of song, which occur
over hundreds of milliseconds of time and in a particular order.
Fig. 6.
Responses of adult LMAN neurons to different
temporal manipulations of the BOS. This single unit responds strongly
to the BOS, especially the first motif (a), and very
little to the BOS reversed (b). The introductory notes
(i) and syllables of each of the two motifs of song are
labeled with lower case letters. c, The
BOS played in reverse order, which maintains the order within each
syllable while reversing the sequence, elicits a phasic response after
each occurrence of syllable a. d, Syllable a also elicits a response when played in isolation (d,
left panel, shown on an expanded time base); reversing
this syllable eliminates that response (d, right
panel).
As expected from the RS of individual neurons, the mean RS to RO for
all LMAN neurons tested was intermediate between the BOS and REV (1.06 spikes/sec ± 0.32 for RO vs 3.53 ± 0.384 for the BOS and
0.26 ± 0.16 for REV; RO < BOS, p < 0.005, Scheffe test; Fig. 4a). The partial response to
features of reverse order songs is consistent with the idea that both
local note structure and note sequence contribute to the temporal
selectivity of adult LMAN neurons.
Combination sensitivity
The properties of the single units in LMAN strongly suggested that
they respond to combinations of features of the song. Combination sensitivity is a nonlinear response property of neurons in which the
response to a combination of features (e.g., syllables abc) is greater
than the simple sum of the responses to each of those features
presented alone (a + b + c). Although this was not the main focus of
these experiments, in a number of cases (n = 8 units in
4 different birds), I demonstrated combination sensitivity in LMAN
neurons by measuring responses to subsets of syllables of the song as
well as the response to the whole song. An example of this is shown in
Figure 7. This single LMAN unit responded robustly to
the BOS, with the bulk of the response centered over the middle portion
of the song (Fig. 7a, syllables e-h). Presentation of the
first four syllables alone (Fig. 7c, syllables a-d)
elicited little response from the neuron (2.4 spikes/sec); the
following two syllables in isolation (Fig. 7d, syllables
e-f) also elicited only a weak response (3.2 spikes/sec). In
combination, however, these two stimuli (syllables a-f) elicited a
strong response from the neuron (Fig. 7e; 17.0 spikes/sec)
that not only exceeded the sum of responses to stimuli a-d and e-f,
but was as strong as the response to the whole song (Fig.
7a; 12.0 spikes/sec; the RS to the whole song is less than
the RS to a-f, because a very similar overall response is normalized
to a longer total stimulus duration). The RS to each of the syllable
combinations presented are shown in Figure 7b; the
dashed white lines represent the linear sum of the RS to the
component syllables. Thus, this unit showed a strongly nonlinear
combination sensitivity. Syllables a-f do not
form the only combination to which the neuron responded nonlinearly, however. A later portion of the same motif (syllables g-i) elicited a
small response primarily during syllable i (Fig. 7f).
Presentation of syllables e-f (ineffective alone, Fig. 7d)
in combination with syllables g-j, however, elicited a much enhanced
response, primarily during syllables g-h (Fig. 7g). Thus,
there is a nonlinear response not only to syllables e-f in combination
with a-d, but also to the syllables e-f combined with syllables g-i.
This dissection of song reveals that there are multiple combinations of
sounds contributing to the overall song selectivity of these very
complex neurons.
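The linearity check described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, and the only data are the three RS values quoted in the text for the unit in Figure 7 (2.4 spikes/sec for a-d, 3.2 for e-f, and 17.0 for a-f). It follows the convention of the dashed lines in Figure 7b, comparing the RS to a combination with the plain sum of the RS values to its components.

```python
def linear_sum(component_rs):
    """Linear prediction: the sum of the RS values to each component alone."""
    return sum(component_rs)


def nonlinearity_ratio(combined_rs, component_rs):
    """Ratio of the RS to the combination over the linear prediction.

    Values above 1 indicate a supralinear (combination-sensitive) response.
    Hypothetical helper; requires a positive linear prediction.
    """
    predicted = linear_sum(component_rs)
    if predicted <= 0:
        raise ValueError("linear prediction must be positive to form a ratio")
    return combined_rs / predicted


# RS values (spikes/sec) quoted in the text for the unit in Figure 7.
rs_a_to_d = 2.4
rs_e_to_f = 3.2
rs_a_to_f = 17.0

predicted = linear_sum([rs_a_to_d, rs_e_to_f])
ratio = nonlinearity_ratio(rs_a_to_f, [rs_a_to_d, rs_e_to_f])
print(f"linear sum = {predicted:.1f} spikes/sec, ratio = {ratio:.2f}")
```

For this unit the response to a-f (17.0 spikes/sec) is roughly three times the linear sum of the responses to a-d and e-f (5.6 spikes/sec).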
Fig. 7.
A combination-sensitive neuron in adult LMAN.
a, The response of the neuron to the entire BOS, the
syllables of which are indicated with lower case letters
below the oscillogram, is shown. b, The mean RS (error
bars indicate SEM) to each of the indicated syllable combinations is
shown. The dashed white lines on the two outermost
bars indicate the linear sum of the responses to the
syllable combinations that are the components of that stimulus. c-g, The neuron's response to the
indicated combinations of syllables (additional description in Results)
is shown. Note that the RS to e-j in comparison with the linear sum of
the RS to e-f and ghi does not seem as strongly nonlinear as the a-f
combinations. This is attributable in part to a different temporal
pattern of responses (see Results) and in part to the underestimate of
firing rate caused by normalizing RS to the entire syllable combination played (which is longer for e-j than for ghi and, thus, underestimates the enhanced response; f vs g).
The particular combinations of syllables presented to a neuron also
influence the temporal pattern of the response. For instance, the
combination g-i (Fig. 7f) elicited a weak response
with a latency of ~300 msec after the onset of syllable g; inclusion of the preceding syllables e-f as well, however, in the combination e-i, revealed a response with a latency of 60 msec after the onset of
syllable g (Fig. 7g). Thus, a new response, with a shorter latency, emerged when e-f was added to g-i, and the subsequent response during syllable i was diminished. Another example of this
shift in response is seen if one compares the response to syllables
g-h within the whole song with the response to these syllables within
the shorter combination e-j; the response to g-h in the combination
e-j alone (Fig. 7g) was much greater than the response to
g-h when preceded by the entire a-f sequence, which elicited a burst
of firing at syllables e-f (Fig. 7a). Similarly, in Figure
6, the response elicited by syllable a in isolation or in the reverse
order song (Fig. 6d,c) is greater than the
response elicited by a in the forward song, when this sound is preceded by other syllables that also elicit responses (Fig. 6a). The
tendency for LMAN neurons both to respond nonlinearly to multiple
combinations of features in the BOS and to have decreased
responsiveness after a burst of firing explains why the temporal
pattern of response to a series of syllables is not always predictable
from the responses to subsets of these syllables.
General properties of LMAN neurons
In addition to their prominent selectivity, adult LMAN neurons had
several other characteristic features. (1) The mean spontaneous firing
rates of these cells were very low (2-3 spikes/sec on average, but
ranging from 0.2 to 10.0 spikes/sec; Table 1). (2) LMAN neurons showed
irregular bursting to auditory stimuli (e.g., Figs. 2a, 5c). As seen in the spike raster in Figure 2a,
the unit's firing, although clearly responsive to the BOS or parts
thereof, was not tightly stimulus-locked and did not occur reproducibly
on every trial. Similarly, the raster plot in Figure 5c
shows the usual irregular responses of LMAN neurons, and the scattered
response latency (of ~55 msec) to this tone burst. (3) LMAN neurons
tended to show poststimulus inhibition of the spontaneous rate after an
effective stimulus (e.g., Figs. 2a, 3a,
7a). (4) The second repetition of a finch song phrase
(motif) did not necessarily elicit as robust a response as the first
appearance of the same syllables, suggesting habituation of the
neuronal response (Figs. 2a, 6a). This tendency
to habituate was evident with overall song repetition as well; stimuli
were always presented at 8-10 sec intervals, because interstimulus
intervals much shorter than that led to a decreased overall
response.
A striking feature of LMAN was that auditory neuronal responses to the
BOS were similar throughout the nucleus within each bird. A comparison
of neurons from the individual birds revealed that the major response
to the BOS tended to occur during the same syllables for each neuron
(examples are shown in Fig. 8a,b). Moreover, if a particular conspecific song or tone burst elicited a
phasic response, it tended to do so for many of the auditory neurons
encountered in that LMAN. For instance, if a single tone burst elicited
a response from an LMAN neuron, the same tone burst was effective for
87.5% of all neurons tested within the nucleus (data from 20 tone-responsive neurons from 4 birds in which at least 4 neurons were
tested for 4 or more frequencies from 1 to 5 kHz). The tendency for all
LMAN neurons recorded in a single bird to respond to similar features
of the BOS was particularly evident when combination sensitivity was
analyzed. For instance, four other units in the LMAN from which the
unit in Figure 7 was recorded showed essentially the same requirements
for syllable combinations (Fig. 8c). These five units
spanned the entire dorso-ventral extent of anterior LMAN. Thus,
although it was not systematically mapped, there was also no strong
correlation of response type with location within the nucleus.
Fig. 8.
a, The similar (although not
identical) pattern of responses to the BOS from three different single
units in adult LMAN of one bird is shown. b, Another
example of three LMAN units from a different bird is shown.
c, The mean RS values to each of the syllable
combinations for 5 different single LMAN units within the same bird are
shown (for 1 unit, combination a-d was not tested). The unit shown in
detail in Figure 7 is represented by the solid dots.
Song-selective neurons in adult X
Song selectivity and order selectivity
The song selectivity of adult LMAN neurons could emerge in LMAN or
might simply be a reflection of song selectivity present in neurons
earlier in the auditory pathway (for instance, the song-selective
neurons known to exist in HVc) (McCasland and Konishi, 1981; Margoliash, 1983, 1986;
Margoliash and Fortune, 1992; Lewicki and Konishi, 1995; Lewicki and Arthur, 1996;
Volman, 1996). To address this
question, I investigated the auditory response properties of the
forebrain song nucleus X, which provides the major input to LMAN via
the intervening thalamic nucleus DLM. Like the units in LMAN, the
majority of X neurons responded strongly to the BOS and exhibited song
and order selectivity (Table 1, Figs. 9, 10). This was
evident from recordings of 37 single units in X of adult finches (from
11 birds, 1-8 units/bird), as well as from numerous small clusters of
units, which had similar properties but were not included in the
quantitative analysis.
Fig. 9.
Auditory responses of a single unit in X. The
response to the BOS (a), to the BOS reversed
(b), and to two different conspecific songs
(c, d). Note that the stimuli in
b and especially c elicit responses,
including poststimulus inhibition (c), but that these responses are lower in magnitude and less sustained than the response to the BOS.
Fig. 10.
Responses of a rapidly firing single unit in X. The response to the BOS (a) is stronger than the
response to the BOS in reverse order song (b), BOS
reversed (c), and conspecific song (d).
The mean spontaneous firing rate of the neuron during each set of trials is indicated by the dashed white line on each
histogram.
Examples of two typical X units are shown in Figures 9 and
10. X units have higher baseline firing rates than LMAN
neurons (Table 1). Figure 9a-d shows a single X
unit with a spontaneous rate of 6.9 ± 0.30 spikes/sec. Like LMAN
units, this unit responded best to the BOS (Fig. 9a) and
much less to REV (Fig. 9b) or CON (Fig.
9c,d). Figure 10 shows another X unit, with a
very high background firing rate (46.33 ± 0.58 spikes/sec). This
high firing rate was not associated with a loss of selectivity; the
neuron responded better to the BOS (Fig. 10a) than to the
same song either in reverse order (Fig. 10b) or completely
reversed (Fig. 10c). Similarly, the unit showed very little
response to conspecific song (Fig. 10d).
As in LMAN, these song-selective properties were present for virtually
all of the auditory X neurons that I recorded; 35/37 single X units
responded significantly to the BOS, and a one-way ANOVA revealed
significant differences between the mean RS of X neurons to the song
stimuli tested (F(3,96) = 7.267, p < 0.0003). Subsequent Scheffe tests confirmed that
the RS to the BOS was significantly greater than the RS to REV or CON
(Fig. 11a) (6.49 spikes/sec ± 0.91 for
BOS vs 1.94 ± 0.49 for REV, p < 0.002, and vs
2.80 ± 0.53 for CON, p < 0.007;
n = 38 for BOS, n = 22 for REV,
n = 28 for CON). The mean RS of X neurons to the BOS
was also greater than their RS to RO (6.49 ± 0.91 to BOS vs 4.77 spikes/sec ± 1.03 to RO; n = 12 for RO) (Fig.
11a), but this difference did not reach statistical
significance (although this lack of significance may be attributable to
the much smaller number of units tested with RO). The song and order
selectivity of X could be measured at the level of single cells and was
not a result simply emerging from the properties of the entire
population of X neurons considered in concert. A comparison of the RS
of individual X neurons (Fig. 11b) shows that a majority of
X neurons were song- and order-selective (i.e., lie below the line that
indicates equal response to forward and reversed or conspecific
stimuli); 20/28 individual neurons responded significantly more
(p < 0.05) to BOS than to CON and 12/21 more to
the BOS than to REV. Like the selectivity of LMAN neurons, the
selectivity of X neurons was neither a property found only in neurons
with high (or low) mean response rates nor a characteristic of neurons
in only a subset of birds.
Fig. 11.
Summary data for adult X neurons.
a, Histogram of mean RS of all X neurons to the BOS,
CON, REV, and RO. Error bars indicate SEM. b, For each
individual X unit for which these stimuli were tested, the RS to the
BOS is plotted versus RS to CON (open squares) or RS to
REV (solid diamonds). Stippled squares
and open diamonds represent data from Bengalese finches.
The dashed line indicates the points where responses to
the stimuli plotted on each axis are equal.
There were no striking differences among the response properties of
different X neurons within the same bird, suggesting a uniform
population of neurons without clear topographic selectivity differences, as was true in LMAN. Moreover, although X neurons were
less selective, the pattern of their responses to stimuli was similar
to that of LMAN neurons in the same bird. For instance, if a particular
conspecific song elicited a phasic response in LMAN, the same song
tended to elicit a response in X.
Differences between adult X and LMAN
Although both X and LMAN neurons were song- and order-selective,
there were several differences between them. (1) X neurons were more
responsive to simple acoustic stimuli than were LMAN neurons (Table 1);
100% of X units tested for frequencies in the 1-5 kHz range showed
onset responses to at least one tone burst (and 63% of these responded
to frequencies covering a 3 kHz range). Only 66% of LMAN neurons
tested for the same frequency range responded to at least one tone
burst. Similarly, >40% of X neurons tested showed responses (usually
onset) to broad-band noise bursts, versus 2% of LMAN units. Both of
these differences between the two nuclei were significant
(p < 0.006, χ² = 7.87, df = 1 for tone bursts; p < 0.0001,
χ² = 29.62, df = 1 for noise bursts). (2) In X, the least effective stimuli still tended to evoke weak responses (e.g., Fig.
9b,c) or no response (Figs. 9d,
10d) from neurons, in contrast to the inhibition of LMAN
neurons frequently seen with non-BOS stimuli. Quantitative comparisons
between these properties of X and LMAN neurons are shown in Table 1; a
greater percentage of X neurons than of LMAN neurons was excited by
reversed and especially conspecific stimuli, and a smaller percentage
of X neurons was inhibited by these stimuli (p < 0.001,
χ² = 18.20, df = 1 for LMAN vs X excitatory
responses to CON; p < 0.0003,
χ² = 14.99, df = 1 for LMAN vs X excitatory responses to REV;
p < 0.05, χ² = 4.08, df = 1 for LMAN
vs X inhibitory responses to CON). The greater response of X neurons to
nonpreferred stimuli and their lack of inhibitory responses were also
evident in the scatterplots of paired RS data from individual neurons
(Fig. 11b vs Fig. 4b); many fewer X neurons had
RS values below zero. (3) Finally, LMAN was more selective than X as
measured by SI (Table 1). A two-way ANOVA with SI as one factor and
nucleus (X and LMAN) as the second factor revealed both a significant
selectivity effect (F(2,164) = 6.92;
p < 0.002) and a significant nucleus effect
(F(1,164) = 33.92; p < 0.0001),
but no significant interaction. The significant nucleus effect and the
lack of significant interaction indicate that the SI values in X were
significantly lower than those in LMAN for all song stimuli. Because X
neurons had higher spontaneous rates than LMAN neurons, an apparent
difference in selectivity could result simply from differences in
spontaneous firing rate. For example, neurons with lower thresholds
(i.e., higher spontaneous rates) might appear less selective, and then
selectivity would apparently emerge as spontaneous rates dropped and
thresholds increased. However, very few X neurons showed inhibitory
responses to songs, whereas a significant fraction of LMAN neurons did. The emergence of inhibition in LMAN cannot be attributed solely to
higher firing thresholds in LMAN and thus is likely to reflect additional (inhibitory) circuitry in or between these two nuclei. In
addition, a regression analysis showed that there was no correlation between spontaneous rate and SI values within either nucleus
(R² = 0.005-0.028 for X, 0.087-0.189 for LMAN;
in LMAN, these values reflect a slight trend for lower spontaneous rate
neurons to be less rather than more selective). Both the presence of
inhibition and the lack of correlation with spontaneous rate suggest
that the increased selectivity in LMAN is not simply a function of firing rate and therefore of higher thresholds.
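As a rough consistency check on the χ² comparisons reported above, the p value implied by a χ² statistic with one degree of freedom can be computed directly, because such a statistic is the square of a standard normal deviate. The sketch below uses only the Python standard library. The function names are hypothetical, and the 2 x 2 counts in the last example are invented for illustration, since the text reports percentages rather than cell counts. It assumes a Pearson χ² without continuity correction, which the text does not specify.

```python
import math


def p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with df = 1.

    For df = 1, chi2 = z**2 with z standard normal, so
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    if chi2 < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi2 / 2.0))


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Statistics quoted in the text for the X versus LMAN comparison.
p_tone = p_value_df1(7.87)    # reported as p < 0.006
p_noise = p_value_df1(29.62)  # reported as p < 0.0001

# Invented counts (NOT from the paper): responders / non-responders in two groups.
example_chi2 = chi2_2x2(10, 20, 20, 10)
print(p_tone, p_noise, example_chi2, p_value_df1(example_chi2))
```

Both computed p values fall below the thresholds quoted in the text, so the reported statistics and significance levels are mutually consistent under these assumptions.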
In seven birds, I recorded units from both X and LMAN
(n = 19 and n = 26 for X and LMAN units,
respectively), allowing a direct comparison of the response properties
of these neurons. A comparison of SI values for X and LMAN units within
birds confirmed that these were always lower for X than for LMAN
neurons (p < 0.04 for CON, p < 0.03 for REV, paired t tests of mean SI values/nucleus for
each bird). An example of two units from the same bird is shown in
Figure 12. This demonstrates many of the differences
between these neurons suggested by the analysis across birds; both
types of neurons respond well to the BOS, but X neurons tend to fire in
a more sustained and reproducible manner from trial to trial and are
more likely to fire to a second repetition of a motif than LMAN neurons
(Fig. 12a,b). These units also demonstrate the frequent difference between X and LMAN responses to simple stimuli. The
X unit responded both to a broad-band noise burst (with an onset and
somewhat maintained response) (Fig. 12e) and to a 3 kHz tone
burst (with a short latency onset and an off response) (Fig. 12f). In contrast, the LMAN unit from the same bird
was not excited by either of these stimuli (Fig.
12c,d). These cells, like the data from the
nuclei considered as a whole, indicate that much song and order
selectivity is already present at the first stage of the AF pathway,
but that X neurons are in general more broadly responsive than LMAN
neurons.
Fig. 12.
Single LMAN and X units recorded from the same
bird. All histograms to the left of center are from
LMAN, and those to the right are from X. a and b, Response of each neuron to the
BOS; c, e, responses to a broad-band
noise burst; d, f, responses to a 3 kHz
tone burst. Note that the LMAN neuron does not respond significantly to
either simple acoustic stimulus, whereas the X neuron responds to both
with an on and an offset response as well as a more maintained response
to the broad-band noise burst.
Auditory units in juvenile LMAN and X
Auditory units selective for song in a pathway required for song
learning might play a role in the auditory feedback crucial to song
learning. To assess this possibility, it is essential to examine these
neurons in young birds to see whether they have auditory properties
early in learning and if so, to determine whether and how these
properties differ from those of adult neurons. In particular, to
determine whether AF neurons are shaped by sensory experience of the
tutor (and thus could represent the template), it would be ideal to
examine the AF nuclei after song memorization but before sensorimotor
learning. This cannot be done in a straightforward manner in zebra
finches, however, because the rapid development (by 90-100 d)
(Immelmann, 1969) is accompanied by an overlap between the phases of
learning (Fig. 1a). Thus, there is no normal stage in
finches with a clear separation between sensory and sensorimotor learning. Therefore, I chose to record from young birds partway through
the process of sensory learning and just beginning sensorimotor learning; that is, zebra finches 30-45 d old. Each of these birds was
raised in a soundproof chamber containing only one adult male (the
father), so that its TUT song experience was known. I recorded from the
LMAN and X of these birds in the same manner as for adults, with the
sole difference being that during the experiment, the birds were
presented with TUT instead of the BOS, because they had not yet
developed their own song.
In both juvenile LMAN and X, many units were auditory and responded
well to song, consistent with an auditory role of these neurons during
learning. Their selectivity differed markedly, however, from that seen
in adult birds: they showed neither song nor order selectivity. This
result was evident from 17 single units in juvenile LMAN and 61 single
units in juvenile X, from a total of 14 birds (2-7 units/bird). As in
adult birds, small clusters of units showed properties similar to those
of the single units in both nuclei but were not used for data
analysis.
A typical juvenile LMAN unit is shown in Figure
13a-d and demonstrates that
juvenile neurons were indeed responsive to auditory stimuli. The
rasters of individual spikes in Figure 13 demonstrate the variable
bursty firing of juvenile LMAN neurons, which was even more irregular
and significantly lower in rate than that of adult LMAN neurons (Table
1). The tutor song (Fig. 13a), with which this bird had been
in contact for ~1 month, elicited a significant response from this
neuron (Fig. 13a). In sharp contrast to neurons in adult
birds, however, this neuron was not song- or order-selective; CON song
and reversed tutor song elicited responses as strong as those to the
TUT (Fig. 13b,c).
Fig. 13.
Auditory responses of a single LMAN unit from a
juvenile zebra finch. a, Response to the tutor song
played forward. The next two panels show the very similar
responses to the tutor song reversed (b) and to a
conspecific song (c). The very long latency response of
the same juvenile LMAN neuron to a broad-band noise burst is seen in
d.
Like juvenile LMAN neurons, juvenile X neurons were also auditory but
nonselective. A representative example is shown in Figure 14 and demonstrates the response to TUT both forward
and in reverse and to CON. As in adult birds, X neurons had higher
baseline firing rates than LMAN neurons (Table 1); furthermore,
juvenile X neurons had significantly higher firing rates than adult X
neurons (Table 1) (p < 0.01, t(97) = 2.639, unpaired t test).
Fig. 14.
Auditory responses of a single X unit from a
juvenile zebra finch. The response to the tutor song forward (TUT)
(a), the tutor song reversed (b), a
conspecific song (c), and a 3 kHz tone burst (d). Note the long latency and the maintained response
to this simple stimulus. The mean spontaneous firing rate of the neuron during each set of trials is indicated by the dashed white
line on each histogram.
The data for the entire set of juvenile X and LMAN neurons are
shown in Figure 15. A one-way ANOVA showed that there
was no significant difference between the mean RS for all song stimuli played to the juvenile units within each of the nuclei (Fig.
15a,d) (F(3,182) = 0.495, p > 0.68 for juvenile X;
F(3,52) = 1.125, p > 0.34 for
juvenile LMAN; n = 61 in X and 17 in LMAN for BOS, n = 52 in X and 15 in LMAN for REV, n = 47 in X and 14 in LMAN for CON, n = 26 in X and 10 in
LMAN for RO). As in the adult birds, this was evident at the level of
single neurons. A plot of the RS of individual neurons to different
stimuli shows that they cluster around the line that represents equal
responses to the two stimuli being compared (Fig.
15b,e); 38/52 juvenile X neurons and 11/12
juvenile LMAN neurons showed no significant difference (p > 0.05) in response to TUT versus REV, and
36/47 juvenile X and 11/11 juvenile LMAN neurons showed no significant
difference in response to TUT versus CON. Similarly, all of the SI
values calculated for the same neurons also cluster around 0.5 (Table 1). A plot of the distribution of the individual neuron SI values for
each nucleus in juvenile birds demonstrates further that the majority
of single neurons in juvenile birds are not selective (Fig.
15c,f) and illustrates how different the
selectivity of these neurons is from that in adult birds. Analysis of
the response selectivity with respect to the range of ages of birds
examined (29-45 d posthatch) showed no correlation with the age of the juvenile birds (for X, R² = 0.00006-0.029,
p > 0.22-0.96; for LMAN, R² = 0.003-0.107, p > 0.23-0.88).
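To make concrete why SI values near 0.5 indicate a lack of selectivity, the sketch below assumes a ratio form for the index, SI(A vs B) = RS_A / (RS_A + RS_B). This form is an assumption adopted here for illustration: the actual definition belongs to Materials and Methods, and the ratio is chosen only because it yields 0.5 when the two responses are equal. The function name is hypothetical, and the adult values plugged in are the population mean RS values for X quoted earlier in the text, not per-neuron data, so the result is not the mean SI reported in Table 1.

```python
def selectivity_index(rs_a, rs_b):
    """Assumed ratio form: SI(A vs B) = RS_A / (RS_A + RS_B).

    Equal responses give 0.5; a response to A alone gives 1.0.
    The ratio is only meaningful here for non-negative RS values.
    """
    if rs_a < 0 or rs_b < 0:
        raise ValueError("this sketch does not handle inhibitory (negative) RS")
    total = rs_a + rs_b
    if total == 0:
        raise ValueError("SI is undefined when both responses are zero")
    return rs_a / total


# A neuron that responds equally to two stimuli is nonselective.
si_equal = selectivity_index(3.0, 3.0)

# Population mean RS values for adult X quoted in the text (spikes/sec).
si_adult_x_rev = selectivity_index(6.49, 1.94)

print(si_equal, round(si_adult_x_rev, 2))
```

Under this assumed form, equal responses to the two stimuli give exactly 0.5, whereas the adult X means for the BOS and REV give a value well above 0.5.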
Fig. 15.
Summary data for juvenile X and LMAN neurons.
Histograms of mean RS of all juvenile X neurons (a) or
LMAN neurons (d) to the tutor song forward
(TUT), conspecific song
(CON), tutor song reversed (REV), and tutor song in reverse order
(RO). Error bars indicate SEM. For all individual
juvenile X (b) and LMAN units (e) for which these stimuli were recorded, the RS to TUT is plotted versus the
RS to CON (open squares) or RS to REV (solid
diamonds). The cumulative percentage of cells at each SI is shown for
both adult and juvenile neurons in X (c) and LMAN
(f); squares, juvenile SI(BOS vs CON); circles, juvenile SI(BOS vs REV);
triangles, adult SI(BOS vs CON); diamonds, adult SI(BOS vs REV).
Simple acoustic stimuli were also much more effective at driving
juvenile LMAN and X neurons than was true in adult birds. More than
40% of juvenile LMAN neurons and 95% of juvenile X neurons responded
to broad-band noise bursts, and these responses had long latencies and
were more sustained than those in adult AF nuclei (Fig. 13d,
Table 1) (both nuclei in juveniles are significantly different from
adult; p < 0.0001, χ² = 25.14, df = 1 for LMAN; p < 0.001,
χ² = 22.05, df = 1 for X). The tone burst responses from juvenile LMAN and X
neurons were also different from those in adults; these responses had
much longer latencies (100-200 msec for LMAN and 50-70 for X) and
were more sustained than the tone onset or on-off responses seen in
adults (compare Fig. 14d with Figs. 3d,
5c, 12f). Unlike adult birds, none of the
SI values were different between juvenile LMAN and X (Table 1);
however, as in adult birds, significantly more neurons in juvenile X
responded to noise and tone bursts than did neurons in juvenile LMAN
(p < 0.009, χ² = 6.95, df = 1 for tone bursts; p < 0.0001,
χ² = 16.60, df = 1 for noise bursts).
DISCUSSION
This study demonstrates that in adult finches, nuclei necessary
for song learning contain auditory neurons that are highly selective
for the BOS and are sensitive to its temporal structure. Neurons with
these properties are well suited for recognizing the spectrally and
temporally rich information in complex vocalizations such as birdsong.
The same nuclei also contain auditory neurons in young finches, early
in the process of learning song. The auditory neurons in juvenile birds
are strikingly different from those in adult birds, however; they show
no song or order selectivity, demonstrating that the selective
properties of these neurons must emerge during development, in parallel
with vocal learning. These neurons represent one of the clearest
examples of experience-dependent shaping of neuronal selectivity to
complex stimuli. Moreover, the sensory properties of these AF nuclei in
both young and adult birds are consistent with an auditory role of this
circuit in the song-learning process.
Selectivity of adult LMAN neurons
Adult LMAN neurons are song-selective, responding better to the
BOS than to any other stimuli, including songs of conspecifics. They
are very similar to the song-selective neurons described in the
sensorimotor song nucleus HVc (Margoliash, 1983, 1986; Margoliash and Fortune, 1992;
Margoliash et al., 1994; Sutter and Margoliash, 1994; Lewicki and Konishi, 1995;
Lewicki and Arthur, 1996; Volman, 1996; see
also below). Like those neurons, the selectivity of LMAN neurons has
several aspects. For one, spectral cues are important; consistent with
this, single tone bursts or specific syllables of conspecific songs
could elicit phasic responses. The same neuron's response to the BOS
was always of much longer duration, however, and was sustained over
several syllables; these neurons thus respond to more than a single
feature of the BOS.
A second important aspect of the selectivity of LMAN neurons, like
those in HVc, is their sensitivity to temporal context. Even if all the
spectral features of the BOS were unchanged, altering the order of
these cues dramatically decreased the response of LMAN neurons. This
was true not only when all the cues were reversed (by completely
reversing the song), but also when the syllables of the song were
played in reverse order. This manipulation preserves the local order of
each syllable but alters the global context, thus demonstrating that
the sensitivity to order is not just local but extends across syllable
boundaries.
LMAN neurons also show temporal combination sensitivity, another level
of auditory context sensitivity, in which neurons respond strongly to a
combination of syllables of the BOS, in a particular order, even when
they fail to respond or respond much less to the individual syllables
or subsets of syllables in isolation. Similar properties have been
described for LMAN neurons in experiments using a single call-like
syllable and its components as stimuli (Saito and Maekawa, 1993).
Temporal combination sensitivity is also seen in song-selective neurons
in HVc (Margoliash, 1983; Margoliash and Fortune, 1992; Lewicki and Konishi, 1995)
as well as in auditory neurons in bats (O'Neill and Suga, 1979; Suga, 1990).
An additional feature of the temporal
combination sensitivity noted here is that the same unit could show
combination-sensitive responses to several different combinations
within the BOS. Not all of these responses were evident when the entire
song was played, but multiple different portions of the song played in
isolation could evoke robust responses. This property might be useful
for piece-wise recognition of song.
The fact that the firing of LMAN song- and order-selective neurons is
sustained over multiple syllables has implications for the neural
encoding of responsiveness to the BOS; peak firing rate on a short time
scale (<100 msec) was not always greatest for the BOS, but the mean
firing rate over a longer time window (>500 msec) was almost always
greatest for the BOS. Responses of LMAN neurons, like those of some
other high-level sensory neurons (Gross, 1972; Newman and Wollberg, 1973;
Desimone et al., 1984; Richmond et al., 1987), also had long and
very variable latencies, did not occur reliably on every trial, and
were not tightly time-locked to the stimulus. Thus, precise spike
timing over short time frames seems not to be a mechanism by which LMAN
neurons encode song.
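The distinction between peak rate in a short window and mean rate over a long window can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names, the 50 msec sliding window, the 1 sec averaging window, and the two spike trains are all hypothetical values chosen only to show how a brief burst can win on peak rate while sustained firing wins on mean rate.

```python
from bisect import bisect_left

def spike_count(sorted_times, start, stop):
    """Number of spikes in [start, stop); times in seconds, already sorted."""
    return bisect_left(sorted_times, stop) - bisect_left(sorted_times, start)

def mean_rate(spike_times, start, stop):
    """Mean firing rate (spikes/sec) over the whole window [start, stop)."""
    return spike_count(sorted(spike_times), start, stop) / (stop - start)

def peak_rate(spike_times, start, stop, window=0.05, step=0.01):
    """Highest rate (spikes/sec) in any short window slid across [start, stop)."""
    times = sorted(spike_times)
    n_steps = int(round((stop - start - window) / step))
    best = 0
    for i in range(n_steps + 1):
        t = start + i * step
        best = max(best, spike_count(times, t, t + window))
    return best / window

# Hypothetical spike times (sec) for one trial each; not data from this study.
bos = [0.10, 0.22, 0.31, 0.40, 0.52, 0.63, 0.71, 0.80, 0.93]  # sustained firing
con = [0.30, 0.31, 0.32, 0.33]                                # brief burst

for name, spikes in (("BOS", bos), ("CON", con)):
    print(name, "peak:", peak_rate(spikes, 0.0, 1.0),
          "mean:", mean_rate(spikes, 0.0, 1.0))
```

In this toy case the burst gives the larger peak rate and the sustained train gives the larger mean rate, which is the pattern described above for responses to the BOS.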
Another feature of LMAN neurons is their similarity in response
properties throughout the nucleus of an individual bird. The number of
single LMAN neurons sampled from an individual bird was often small,
however, and to be certain of this lack of topography, a more
exhaustive mapping of LMAN responses will be necessary. Nonetheless,
within the limits of these sample sizes, there is not an obvious
"library" of different LMAN neurons, each tuned to particular
features of a bird's song. Instead, many neurons appear to be tuned to
a similar set of features of the BOS. This is also true for neurons in
HVc (Sutter and Margoliash, 1994). This result contrasts with a
topography for auditory response properties that might have been
expected in LMAN, given the recently described topography of LMAN
projections to the RA (Johnson et al., 1995).
Comparison of LMAN and HVc neurons
The major source of auditory input to LMAN is HVc (via X)
(Katz and Gurney, 1981) (A. Doupe, unpublished observations), and thus
the selectivity observed in adult LMAN likely reflects that of neurons
in HVc. The song-selective neurons seen in adult LMAN are in fact
strikingly similar to those in HVc in many respects, including their
spectral and temporal selectivity for the BOS, their temporal and
harmonic combination sensitivity, and their lack of topography
(Margoliash, 1983, 1986; Margoliash and Fortune, 1992; Margoliash et
al., 1994; Sutter and Margoliash, 1994; Lewicki and Konishi, 1995;
Lewicki and Arthur, 1996; Volman, 1996). Numerous individual neurons in
HVc show selectivity equal to that of the LMAN neurons presented here,
including inhibition by non-BOS stimuli. Although it is difficult to
compare data collected in different laboratories and in different
nuclei, comparison of LMAN selectivity indices for the BOS versus REV
or CON with indices calculated similarly for HVc suggests that these
are similar for the two nuclei, although slightly higher for LMAN than
for HVc (Margoliash and Fortune, 1992; Margoliash et al., 1994; Volman,
1996). HVc also seems to be more heterogeneous than LMAN; a number of
studies (Margoliash, 1983; Saito and Maekawa, 1993; Lewicki and Arthur,
1996) have described neurons in HVc that respond well to simple stimuli
or equally to forward and reversed song, whereas very few such neurons
are found in LMAN. Lewicki and Arthur (1996) found that only 50% of
song-responsive HVc neurons had a significantly greater response to
forward than reversed song, versus 83% using the same criterion in the
present study of LMAN. Anatomical and neurophysiological studies have
shown that HVc contains two separate populations of auditory projection
neurons, one to RA and the other to the AF (Katz and Gurney, 1981;
Gahr, 1990; Doupe and Konishi, 1991; Kirn et al., 1991; Sohrabji et
al., 1993; Vicario and Yohay, 1993). Thus, the properties of LMAN
actually reflect those of a subset of HVc neurons, but whether the
song-selective properties of this HVc subpopulation differ from the
properties of RA-projecting neurons or from HVc as a whole has never
been systematically studied.
Selectivity of adult X neurons
X neurons are interposed between two song-selective nuclei, HVc
and LMAN, and thus might be expected to be very similar to neurons in
these areas. Neurons in X are indeed very song- and order-selective,
but their properties differ to some extent from those of both HVc and
LMAN neurons. X neurons tend to be more broadly responsive to a variety
of stimuli, including simple tone bursts and conspecific songs, and
they are less likely than HVc and especially LMAN neurons to be
inhibited by nonpreferred stimuli. This apparent loss of selectivity of
X neurons relative to their inputs may result primarily from their high
spontaneous rates (and thus perhaps lower thresholds for responding);
moreover, the effects of inhibitory responses to song in HVc cannot be
fed forward. Because there are two classes of auditory projection
neurons in HVc, it is also possible that larger numbers of HVc
neurons with simpler auditory response properties project to the AF
pathway. Recent results with intracellular recording and filling of HVc
neurons, however, suggest that at least some of the X-projecting HVc
neurons are very song-selective (Lewicki, 1996).
Because LMAN neurons are more narrowly responsive and more likely to
show inhibitory responses than X neurons, and because X provides their
only known auditory input, the circuit between X and LMAN may create
these differences in response properties, perhaps via inhibitory
processing in X, LMAN, or the intervening thalamic nucleus DLM. In many
sensory systems, gradual increases in stimulus selectivity are the
result of hierarchical circuits such as the AF pathway (DeYoe and Van
Essen, 1988; Konishi et al., 1988; Livingstone and Hubel, 1988; Rose et
al., 1988). Nonetheless, the differences in selectivity along the adult
AF are slight, and the functional implications of a sequence of very
song-selective nuclei remain unclear.
X has recently been shown to receive a second song system input from
recurrent collaterals from LMAN neurons (Nixdorf-Bergweiler