The Journal of Neuroscience, June 1, 1999, 19(11):4559-4584
Contributions of Tutor and Bird's Own Song Experience to Neural
Selectivity in the Songbird Anterior Forebrain
Michele M. Solis and Allison J. Doupe
Keck Center for Integrative Neuroscience and Neuroscience Graduate
Program, Departments of Physiology and Psychiatry, University of
California, San Francisco, San Francisco, California 94143-0444
ABSTRACT
Auditory neurons of the anterior forebrain (AF) of zebra finches
become selective for song during song learning. In adults, these
neurons respond more to the bird's own song (BOS) than to the songs of
other zebra finches (conspecifics) or BOS played in reverse. In
contrast, AF neurons from young birds (30 d) respond equally
well to all song stimuli. AF selectivity develops rapidly during song
learning, appearing in 60-d-old birds. At this age, many neurons also
respond equally well to BOS and tutor song. These similar neural
responses to BOS and tutor song might reflect contributions from both
song experiences to selectivity, because auditory experiences of both
BOS and tutor song are essential for normal song learning.
Alternatively, they may simply result from acoustic similarities
between BOS and tutor song. Understanding which experience shapes
selectivity could elucidate the function of song-selective AF neurons.
To minimize acoustic similarity between BOS and tutor song, we induced
juvenile birds to produce abnormal song by denervating the syrinx, the
avian vocal organ, before song onset. We recorded single neurons
extracellularly in the AF at 60 d, after birds had had substantial
experience of both the abnormal BOS (tsBOS) and tutor song. Some
neurons preferred the unique tsBOS over the tutor song, clearly
indicating a role for BOS experience in shaping neural selectivity. In
addition, a sizable proportion of neurons responded equally well to
tsBOS and tutor song, despite their acoustic dissimilarity. These
neurons were not simply immature, because they were selective for tsBOS
and tutor song relative to conspecific and reverse song. Furthermore,
their similar responses to tsBOS and tutor song could not be attributed
to residual acoustic similarities between the two stimuli, as measured
by several song analyses. The neural sensitivity to two very different
songs suggests that single AF neurons may be shaped by both BOS and
tutor song experience.
Key words: auditory selectivity; song selectivity; experience-dependent plasticity; NXIIts transections; LMAN; Area X; zebra finch
INTRODUCTION
Songbirds, much like humans, depend
on auditory experience during early life to learn their vocal behavior.
This learning occurs in two stages, called the sensory and sensorimotor
phases (Fig. 1A).
During the sensory phase, a young bird listens to and memorizes the
song of its tutor; this memory is called the "template." The
sensorimotor phase begins with the onset of singing; using auditory
feedback, the juvenile compares its immature vocalizations with the
tutor song template and gradually modifies the plastic song until it
produces a mature "crystallized song," which is highly stereotyped
and resembles the tutor song. Thus, experience of both the tutor song
and the bird's own song (BOS) is required for normal song learning
(Konishi, 1965; Price, 1979).

Figure 1.
A, Zebra finches learn to sing in
two overlapping phases. The sensory phase ends at ~60 d; the
sensorimotor phase begins at ~30 d and continues until 90 d.
B, Anatomy of the song system. Motor pathway nuclei are
gray, and the AF nuclei are black.
C, AF neurons develop selectivity for song during
development. At 30 d, LMAN neurons have equal RS to tutor song
(TUT), conspecific song
(CON), and reverse tutor song
(REV). At 60 d, these neurons respond
significantly more to TUT than to CON or to REV. In addition, BOS
elicits a stronger RS than CON or reverse BOS
(REV). In adults, LMAN neurons are extremely
selective for BOS. D, At 60 d, there is a range of
BOS versus tutor song preferences among LMAN neurons. The cumulative distribution of
preferences is shown, as quantified with a
d'BOS-tutor value for each neuron (see
Materials and Methods). Neurons with values ≥0.5 are considered to
prefer BOS over tutor song, and neurons with values of −0.5 or less
are considered to prefer tutor over BOS. Gray shading
highlights those values for which there was no strong preference for
one song over the other (−0.5 < d'BOS-tutor < 0.5).
Likely candidates for circuits involved in processing BOS and tutor
song experience during learning lie within the song system, a group of
nuclei dedicated to song learning and production (Fig. 1B). The motor pathway, which is necessary for normal
song production throughout life, includes HVc, the robust
nucleus of the archistriatum (RA), and the tracheosyringeal portion of
the hypoglossal nucleus (nXIIts). The nXIIts contains the motor neurons
innervating the muscles of the syrinx, the avian vocal organ. RA also
projects to a group of nuclei associated with respiration, such as
nucleus retroambigualis (RAm) and nucleus paraambigualis (PAm) (Wild, 1993, 1997; Reinke and Wild, 1998);
these participate in vocalization by controlling the respiratory musculature involved in airflow through
the syrinx. In contrast to the motor pathway, nuclei of the anterior
forebrain (AF) pathway are not required for singing in adulthood, but
play a critical but as yet unknown role during song learning (Bottjer et al.,
1984; Sohrabji et al., 1990; Scharff and Nottebohm, 1991; Basham et
al., 1996). The AF pathway comprises Area X (X), the medial nucleus of
the dorsolateral thalamus (DLM), and the lateral magnocellular nucleus
of the anterior neostriatum (LMAN), and indirectly connects HVc to RA.
Thus, the AF might process auditory information essential for learning
and might use it to modulate motor pathway activity.
Consistent with an auditory role for the AF during learning, AF neurons
in adult, anesthetized birds are auditory and respond selectively to
BOS (Doupe and Konishi, 1991). Neurons selective for BOS prefer it to
the songs of other zebra finches (conspecific song) and to BOS played
in reverse. These song-selective neurons resemble those found in HVc
(Margoliash, 1983), as well as neurons tuned to species-specific
vocalizations found in bats (Suga et al., 1978; Esser et al., 1997),
rhesus monkeys (Rauschecker et al., 1995), and marmosets (Wang et al.,
1995). AF neurons from young juvenile birds lack selectivity, however,
responding equally well to all song stimuli at 30 d of age (Fig.
1C). Song selectivity develops rapidly, because it is found
in 60-d-old juveniles (Solis and Doupe, 1997).
Determining the experience responsible for AF neuron selectivity could
elucidate AF function during song learning. For example, neurons tuned
by BOS experience could provide feedback about the current state of
BOS, whereas those tuned by tutor song experience could store tutor
song information. When neural responses to BOS and tutor song are
compared at 60 d, a range of preferences for one song over another
is evident (Fig. 1D; adapted from Solis and Doupe,
1997). Many neurons prefer BOS over tutor song, suggesting a role for
BOS experience in shaping selectivity. A few neurons prefer tutor song
over BOS, suggesting that they were tuned by tutor song experience.
Finally, many neurons respond equally well to both BOS and tutor song.
These neurons are clearly selective, because they do not simply respond
to any song stimulus. Such neurons could have been shaped by both BOS
and tutor song experience. Alternatively, these neurons might indicate
acoustic similarities between the two songs; by 60 d some
juveniles' plastic songs clearly resemble their tutor song.
If neurons with similar responses to BOS and tutor song result from
acoustic similarities between the two songs, then it is unclear which
song experience is responsible for neural selectivity. Inducing a
juvenile bird to produce an abnormal song could resolve this issue,
because it would reduce similarity between BOS and tutor song (Fig.
2A). If neurons with
equivalent responses to BOS and tutor song result from the similarities
between these two songs, then such neurons should not exist in birds
with songs very different from their tutor song (Fig. 2B,
solid line). Alternatively, if such neurons reflect the
contributions of both song experiences, then neurons with similar
responses to the abnormal song and tutor song should persist (Fig.
2B, dashed line).

Figure 2.
Consequences of decreasing similarity between BOS
and tutor song. A, When a juvenile stores a good copy of
the tutor song (A) as its template
(A) and accurately models its own song after the
template, the resulting BOS (A) will highly
resemble the tutor song. Thus, if a neuron is tuned by BOS experience
only, it could also respond well to tutor song when the two songs are
similar enough. This ambiguity could be resolved by making the BOS very
different (B) from the tutor song.
B, Decreasing the similarity between BOS and tutor song has two
predicted outcomes on the distribution of
d'BOS-tutor values. If BOS experience
shapes some neurons, and tutor song experience shapes others, then the
distribution should be split in two, with some neurons preferring tsBOS
over tutor song and others preferring tutor over tsBOS but none
responding equally well to both (solid line).
Alternatively, if both tsBOS and tutor song influence the neural
properties of single neurons, then neurons with equivalent responses
should persist (dotted line). C, If a
poor copy of the tutor song (A) is stored as the
template (a) and then a good copy of the template
is produced, then the resulting BOS (a) is a
better model of the template than the tutor song itself. In
this case, neurons preferring BOS would nonetheless reflect tutor song
experience. Inducing an abnormal BOS by disrupting sensorimotor
learning (B) should decrease the similarity
between BOS and a song resulting from poor memorization of the tutor
song.
Birds producing abnormal songs could also clarify the experience
responsible for neurons that prefer BOS over tutor song in normal
60 d birds. The simplest interpretation is that these neurons are
shaped by BOS experience. If, however, a bird has poorly copied the
tutor song during the sensory phase, then these neurons might instead
represent the template. This possibility is schematized in Figure
2C; if a bird stores a poor copy of the tutor
(A) as its template (a) and models its own
song accurately after the template (a), then BOS itself is a
better representation of the template than the tutor song. This issue
could be resolved with birds induced to produce very abnormal songs; if
neurons preferring BOS over tutor song persist in such birds, then it
is likely that they result from experience of the song unique to that bird.
In this study, we minimized the similarity between the songs of
juvenile birds and their tutors by transecting the tracheosyringeal portion of the hypoglossal nerve [NXIIts (ts)], which innervates the
syringeal muscles, before song onset. Extracellular recordings of
single LMAN and X neurons in these birds at 60 d showed that, although the BOS and tutor song were now acoustically very different, many neurons still responded equally well to both stimuli. This result
is similar to that found in normal 60 d birds and suggests a role
for both song experiences in shaping AF selectivity.
MATERIALS AND METHODS
Animals. Experiments used male juvenile zebra finches
(Taeniopygia guttata). The care and treatment of
experimental animals was reviewed and approved by a university animal
care and use committee at University of California, San Francisco
(UCSF). Birds were raised in individual cages, with their parents and
siblings from the same clutch. Opaque dividers between cages visually
isolated birds from other conspecifics in the colony. Because juvenile birds shared a cage with a single adult male tutor and were visually isolated from other conspecifics within earshot, their learning should
have been restricted to the tutor in their cage (Immelmann, 1969;
Eales, 1987; Eales, 1989; Williams, 1990).
Surgery. When birds were 26-33 d old (mean ± SD,
28 ± 2 d), the tracheosyringeal portion of the hypoglossal
nerve (NXIIts) was transected bilaterally under isoflurane anesthesia
[0.5-1.5% (v/v); Abbott Laboratories, North Chicago, IL]. The
nerves were exposed by an incision along the skin of the neck, where
lidocaine had been injected subcutaneously (2% solution; Elkins-Sinn,
Cherry Hill, NJ). The NXIIts nerve was dissected away from the trachea at the proximal end of the incision and cut; dissection then continued along the length of the neck, and the nerve was pulled to remove the
distal end. This removed ~1 cm of nerve. After bilateral
transections, the skin was closed with skin adhesive (Krazy Glue;
Borden, Columbus, OH). The ts cut birds were returned to their home
cages until they were 60 d old.
Two days before the experiment, we prepared birds for recording by
affixing a head post to the skull and marking the location of the song
nuclei on the skull (for details, see Solis and Doupe, 1997). On the
day of the experiment, the bird was anesthetized with a 20% solution
of urethane (5 ml/kg, i.m.; Sigma, St. Louis, MO; delivered in three
injections at 30 min intervals), placed in the stereotaxic apparatus,
and immobilized via its head post. Body temperature was regulated with
a temperature controller (FHC, Brunswick, ME). A craniotomy was
performed above LMAN and X, the dura was opened, and the electrode was
lowered into the brain with a microdrive (Fine Science Tools, Foster
City, CA).
Stimuli. One to 2 d before the experiment, the songs of
the ts cut bird and its tutor were recorded. Each bird was placed in a
sound-attenuated chamber (Acoustic Systems, Austin, TX) connected to an
automatically triggered audio system. Approximately 90 min of bird
sounds were recorded and then scanned for song. A typical plastic song
rendition was usually chosen after listening to at least 25 songs and
looking at several song spectrograms; a typical song was considered to
be the song most frequently sung. A typical tutor song was chosen after
listening to 10 songs. Songs were digitized at 32 kHz and stored on a
SPARC (Sun Microsystems, Palo Alto, CA) IPX computer at similar
peak intensity levels (range, 64-73 dB; software by Michael Lewicki
and Larry Proctor, California Institute of Technology, Pasadena, CA).
In 15 experiments, three different plastic song renditions from a bird
were stored for presentation during the experiment. The durations of
tsBOS and tutor songs ranged from 602 to 2461 msec.
During electrophysiological recording, acoustic stimuli were presented
by a speaker 25 cm away from the bird, inside a double-walled anechoic
sound-attenuated chamber (Acoustic Systems, Austin, TX). The frequency
response measured at the bird's location inside the chamber was flat
(±5.0 dB) between 500 Hz and 8 kHz. The stimuli included songs of the
ts cut juvenile (tsBOS), its tutor song, reverse manipulations of tsBOS
and tutor song, the songs of other zebra finches (conspecific), the
acoustically similar songs of other species of estrildid finches
(heterospecific), broad-band noise bursts, and tone bursts. Stimuli
were presented in a random, interleaved manner. An effort was made to
present each neuron with 15-20 trials of each stimulus type: tsBOS,
reverse tsBOS, reverse order tsBOS, tutor, reverse tutor, reverse order
tutor, at least two adult conspecific songs, at least two
heterospecific songs, at least two juvenile conspecific songs, at least
two ts cut juvenile conspecific songs, broad-band noise bursts, and
tone bursts; however, some neurons were lost before characterization was completed.
Electrophysiology. Extracellular neuronal signals were
amplified and filtered between 300 Hz and 10 kHz (A-M Systems, Everett, WA). To locate auditory neurons, search stimuli included tsBOS, tutor
song, adult conspecific song, heterospecific song, broad-band noise
bursts, and tone bursts. Most neurons were isolated with a window
discriminator (UCSF Physiology Shop). Twelve units were isolated
offline using spike-sorting software (Lewicki, 1994). To do this,
waveforms were recorded during stimulus presentation in the course of the
experiment. Later, spike models were constructed from waveforms
recorded at an intermediate time during stimulus presentation. These
spike models were then used to classify spikes within the rest of the
waveforms. Both spike model construction and template-matching algorithms were based on Bayesian probability theory. Neuronal responses were collected and analyzed by a SPARC IPX computer using
software developed by Mike Lewicki and Larry Proctor (California Institute of Technology) and Frédéric Theunissen (UCSF).
Electrolytic lesions were made at selected locations for reconstructing
recording sites.
Anatomy. At the end of an experiment, the bird was deeply
anesthetized with Metofane (Pitman-Moore, Mundelein, IL) and
transcardially perfused with 0.9% saline, followed by 3.7% formalin
in 0.025 M phosphate buffer. Brains were post-fixed and cut
in 40 µm sections with a freezing microtome. Sections were stained
with cresyl violet, and electrode tracks and lesions were identified.
Only neurons histologically confirmed to be in LMAN or X were used;
their specific location within each nucleus was also documented.
RA volumes were measured for each ts cut bird and for normal 60 d
birds, recorded in a previous study. Measurements were made blind to
the experimental condition. The Nissl-defined boundaries of RA were
traced at 80 µm intervals, and the resulting area was calculated
using an image analysis program (NIH Image). The total area was
multiplied by section thickness and then by the total number of
sections to give a final volume. Because of individual differences in
post-fixation time, each RA volume was normalized by the volume of the
nucleus pretectalis (PT), which is unrelated to the song system. Final
RA/PT ratios were compared between ts cut and normal birds. When
measurements from both hemispheres were available, the mean RA volume
and mean PT volume were used. For nine ts cut birds, PT volume was not
available. Thus, RA volumes alone were also compared within all ts cut
birds for which post-fixation times were equivalent.
The syrinx of each ts cut bird was also dissected after perfusion. Each
syrinx was cut 1 mm distal and 4 mm proximal of the bifurcation of the
bronchi and then weighed to assess relative muscle mass, a marker of
denervation success.
Data analysis. We quantified responses to an acoustic
stimulus during the period of stimulus presentation, offset by an
estimate of the latency. The latency of each neuron was measured by
examining its responses to a broad-band or tone burst stimulus with a
peristimulus time histogram (PSTH) divided into 5 or 10 msec bins. The
latency was defined as the onset of the first of two consecutive bins during the stimulus that had at least twice as many spikes as the mean
number of spikes per bin during the background. LMAN neurons often did
not respond to broad-band noise or tone bursts. For these cases, the
latency of another neuron from the same bird was used; if there was
none, then the neuron was assigned a latency characteristic of neurons
from normal 60 d birds (65 msec; from Solis and Doupe, 1997).
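The latency criterion above can be sketched as follows. This is a minimal illustration, not the authors' analysis software: the function name, the pooled-spike-time input, and the handling of a silent background (a qualifying bin must contain at least one spike) are assumptions made for the example.

```python
def estimate_latency(spike_times_ms, stim_duration_ms, background_mean_per_bin, bin_ms=5):
    """Return the onset (ms) of the first of two consecutive PSTH bins whose
    counts are at least twice the mean background count per bin, or None.

    spike_times_ms: spike times relative to stimulus onset, pooled over trials.
    background_mean_per_bin: mean spikes per bin during background, pooled the
    same way (hypothetical input; it is computed elsewhere).
    """
    n_bins = int(stim_duration_ms // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        b = int(t // bin_ms)
        if 0 <= b < n_bins:
            counts[b] += 1

    threshold = 2 * background_mean_per_bin
    for i in range(n_bins - 1):
        first_ok = counts[i] >= threshold and counts[i] > 0
        second_ok = counts[i + 1] >= threshold and counts[i + 1] > 0
        if first_ok and second_ok:
            return i * bin_ms
    return None  # no response detected; fall back to another neuron's latency
```

A `None` result corresponds to the case described above in which a neuron did not respond to noise or tone bursts and a substitute latency had to be used.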
To be considered auditory and included for analysis, a neuron had to
have an average firing rate during one of the stimuli that was
significantly different from the background rate (two-tailed paired
t test, p < 0.05). The firing rate during a
stimulus was obtained by normalizing the number of spikes elicited
during the stimulus by the duration of the stimulus. The background
rate was calculated by averaging the firing rate of the neuron from two
different periods: 2 sec preceding stimulus onset and 2-3 sec
beginning 1 sec after the end of the stimulus. The response strength
(RS) of a neuron to a stimulus was the difference between the firing
rate during the stimulus (offset by the latency) and the background
rate. The RS was measured for each stimulus trial and then averaged
across trials to get the neuron's RS to that stimulus, expressed in
spikes per second. Data for different stimuli but of the same stimulus
type were also averaged in this way to get an RS for a stimulus type;
e.g., to obtain the RS for adult conspecific song, the RS values for
each trial of two different adult conspecific song stimuli were
averaged together.
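As a rough sketch of the response strength calculation, the following computes a per-trial RS and averages it across trials. The function names are hypothetical, and two details are assumptions not stated in the text: the two background periods are averaged with equal weight, and the post-stimulus background period is fixed at 2 sec.

```python
from statistics import mean

def firing_rate(spike_times_s, start_s, end_s):
    """Spikes per second within [start_s, end_s)."""
    n = sum(1 for t in spike_times_s if start_s <= t < end_s)
    return n / (end_s - start_s)

def trial_response_strength(spike_times_s, stim_on_s, stim_off_s, latency_s,
                            post_bg_s=2.0):
    """RS for one trial: latency-shifted stimulus rate minus background rate."""
    stim_rate = firing_rate(spike_times_s,
                            stim_on_s + latency_s, stim_off_s + latency_s)
    # Background period 1: the 2 sec preceding stimulus onset.
    pre_rate = firing_rate(spike_times_s, stim_on_s - 2.0, stim_on_s)
    # Background period 2: begins 1 sec after stimulus end (length assumed).
    post_rate = firing_rate(spike_times_s,
                            stim_off_s + 1.0, stim_off_s + 1.0 + post_bg_s)
    background = (pre_rate + post_rate) / 2  # equal weighting is an assumption
    return stim_rate - background

def mean_response_strength(trials):
    """Average RS over trials; each trial is (spikes, on, off, latency)."""
    return mean(trial_response_strength(*trial) for trial in trials)
```

Passing the trials of several stimuli of the same type to `mean_response_strength` gives the per-type RS described above.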
The selectivity of an individual neuron for one stimulus (A) over
another (B) was quantified using the d'A-B
measure (Green and Swets, 1966), where:

$$d'_{A-B} = \frac{2\,(\overline{RS}_A - \overline{RS}_B)}{\sqrt{\sigma_A^2 + \sigma_B^2}}$$

In this equation, $\overline{RS}_A$ and $\overline{RS}_B$ are the mean RS to
stimulus A and B, respectively, and $\sigma^2$ is the variance of
each RS. If d'A-B is positive, then stimulus A
elicited a greater response; if it is negative, then stimulus B
elicited a greater response. Values of d'A-B close to 0 indicate no difference in the RS elicited by the two stimuli. A particular d' value was calculated only for
neurons that had a significant response to at least one of the two
stimuli compared. A neuron was considered selective for stimulus A over stimulus B if it had a d'A-B value ≥0.5. This
criterion was based on the observation that neurons with a
d'A-B value ≥0.5 usually had an RS to stimulus
A that was at least twice as great as that to stimulus B (Solis and
Doupe, 1997). Also, a d'A-B value of 0.5 corresponds to a significantly greater response to stimulus A than to
stimulus B, based on a paired t test with 20 presentations
of each stimulus (p = 0.031).
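The d' measure and the 0.5 criterion might be computed along these lines. This sketch assumes the form d' = 2(mean RS_A − mean RS_B)/sqrt(var_A + var_B) and uses the sample variance of per-trial RS values; both the function names and the choice of sample (rather than population) variance are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance

def d_prime(rs_a, rs_b):
    """d' between two lists of per-trial RS values (spikes/s).

    Uses sample variance (an assumption); needs at least two trials per
    stimulus and a nonzero summed variance.
    """
    return 2 * (mean(rs_a) - mean(rs_b)) / sqrt(variance(rs_a) + variance(rs_b))

def preference(rs_a, rs_b, criterion=0.5):
    """Classify a neuron as preferring 'A', 'B', or 'none' by the d' criterion."""
    d = d_prime(rs_a, rs_b)
    if d >= criterion:
        return "A"
    if d <= -criterion:
        return "B"
    return "none"
```

For example, `preference(tsbos_rs, tutor_rs)` would place a neuron in one of the three groups used when comparing tsBOS and tutor song responses, where `tsbos_rs` and `tutor_rs` are hypothetical lists of per-trial RS values.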
To convey the magnitude of the difference between the RS elicited by
two different stimuli, the selectivity index (SI) was also calculated
(Volman, 1996; Doupe, 1997). The SI compared the mean RS with each
stimulus in ratio form:

$$SI_{A-B} = \frac{\overline{RS}_A}{\overline{RS}_A + \overline{RS}_B}$$
When comparing RS to two stimuli with large differences in song
duration, normalizing spike counts elicited by the two stimuli by
stimulus duration may bias comparisons of the RS. For example, if two
stimuli, one short and one long, elicit a similar response in which the
neuron initially fires strongly and then fatigues, then normalizing by
song duration will give a substantially decreased RS for the long
stimulus relative to the shorter stimulus; this in turn will result in
a d' value that prefers short stimuli over long stimuli.
Because large differences in song duration occurred in several
experiments, a peak RS was also calculated to remove bias attributable
to varying song durations in the comparisons of a neural response.
First, a maximum firing rate during the stimulus was found using a 500 msec sliding window, which moved across a response in 1 msec
increments. Second, the maximum background rate was also found using a
500 msec window. Third, the peak RS was calculated by taking the
difference between the maximum firing rate during the stimulus and the
maximum background rate; this peak measurement removes duration bias
because it normalizes every spike count by 500 msec, regardless of the
stimulus duration. Finally, peak d' values were also
calculated using the peak RS obtained from the 500 msec window. A 500 msec window was chosen for two reasons. First, it was shorter than the
shortest song stimulus (602 msec). Second, for a subset of neurons
(five from LMAN and five from X), a series of sliding windows (10-2000
msec) were used to calculate the peak RS and resulting peak
d' values. Among those windows <600 msec, the 500 msec
window gave the largest peak d' values between two stimuli
of similar durations. For some cells, windows >500 msec resulted in
d' values higher than those for short windows (our
unpublished data); this indicates that peak d' measures can
underestimate the selectivity of a cell.
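The peak RS procedure can be illustrated with a simple sliding-window count. This is a sketch under stated assumptions: spike times are in milliseconds relative to the start of each period, windows must lie entirely within the period, and the function names are hypothetical.

```python
def max_windowed_rate(spike_times_ms, start_ms, end_ms, window_ms=500, step_ms=1):
    """Maximum firing rate (spikes/s) over windows lying fully in [start, end)."""
    spikes = sorted(t for t in spike_times_ms if start_ms <= t < end_ms)
    best = 0
    lo = hi = 0
    w_start = start_ms
    while w_start + window_ms <= end_ms:
        # Advance lo to the first spike inside the window,
        # and hi to the first spike at or beyond the window's end.
        while lo < len(spikes) and spikes[lo] < w_start:
            lo += 1
        while hi < len(spikes) and spikes[hi] < w_start + window_ms:
            hi += 1
        best = max(best, hi - lo)
        w_start += step_ms
    # Every count is normalized by the same window length.
    return best / (window_ms / 1000.0)

def peak_rs(stim_spikes_ms, stim_ms, bg_spikes_ms, bg_ms, window_ms=500):
    """Peak RS: maximum stimulus rate minus maximum background rate."""
    return (max_windowed_rate(stim_spikes_ms, 0, stim_ms, window_ms)
            - max_windowed_rate(bg_spikes_ms, 0, bg_ms, window_ms))
```

Because the divisor is always the window length, a long and a short song with the same initial burst yield the same peak RS, which is the duration bias the measure is meant to remove.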
Cluster analysis. We tested whether the
d'tsBOS-tutor values of neurons recorded from
each bird were more similar than expected by chance. To do this, the
variance of the d'tsBOS-tutor values obtained
experimentally from each bird was compared with a simulated
distribution of variances created from the data from all birds. This
distribution was determined from 1000 Monte Carlo simulations; each
simulation randomly selected n
d'tsBOS-tutor values from the pool of all
experimental d'tsBOS-tutor values (includes all
cells from all birds) and calculated their variance (n
equals the number of cells recorded in each bird). The median of the
resulting distribution of simulated variances was compared with each
bird's experimental variance. If the experimental variance was
significantly less than the median of the simulated distribution (one-sample sign test, p < 0.05), the
d'tsBOS-tutor values from that bird were
considered clustered. A sign test determined whether the frequency of
clustering in the group of birds studied was greater than expected by
chance. This procedure was completed for
d'tsBOS-tutor values from LMAN neurons alone, X
neurons alone, and both neuron types together.
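A minimal sketch of the per-bird clustering test follows. Several details are assumptions because the text does not specify them: values are drawn without replacement, the sign test is one-sided and counts simulated variances above versus below the bird's variance, and all function names are hypothetical.

```python
import random
from math import comb
from statistics import variance

def simulated_variances(pool, n, n_sims=1000, seed=0):
    """Variance of n values drawn at random from the pooled d' values,
    repeated n_sims times (sampling without replacement is an assumption)."""
    rng = random.Random(seed)
    return [variance(rng.sample(pool, n)) for _ in range(n_sims)]

def sign_test_p(observed, simulated):
    """One-sided sign test: is the median of `simulated` above `observed`?"""
    above = sum(1 for v in simulated if v > observed)
    below = sum(1 for v in simulated if v < observed)
    m = above + below  # ties are dropped
    if m == 0:
        return 1.0
    # P(X >= above) for X ~ Binomial(m, 0.5), computed with exact integers.
    return sum(comb(m, i) for i in range(above, m + 1)) / 2 ** m

def is_clustered(bird_values, pool, alpha=0.05):
    """True if the bird's d' variance is significantly below the simulated median."""
    sims = simulated_variances(pool, len(bird_values))
    return sign_test_p(variance(bird_values), sims) < alpha
```

Running `is_clustered` once per bird, with `pool` holding the d' values from all cells in all birds, gives the per-bird outcomes that the second sign test then evaluates across the group.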
Song analysis: similarity. Once electrophysiology
experiments were completed, we analyzed the tsBOS and tutor songs
themselves using several methods. Song is composed of syllables, which
are continuous acoustical signals, 10-200 msec in duration. Syllables are separated from other syllables by a sudden fall in amplitude to
near zero or by brief silent intervals. Syllables are composed of
smaller continuous signals called "notes." A repeated sequence of
syllables is a "motif." A song "bout" consists of introductory notes followed by one or more motifs (for detailed song descriptions, see Price, 1979; Sossinka and Bohner, 1980).
The first song analysis was a matching task, completed by nine human
observers familiar with zebra finch song but blind to the neural
properties of each bird. Observers tried to match each experimental
song with that of its tutor, which was present among a group of six
potential tutors. The observers listened to and looked at sonograms and
oscillograms of the songs before selecting the tutor song that best
matched the experimental song. Thus, the percentage of observers that
correctly matched the experimental song to its tutor song indicated the
overall similarity between tsBOS and tutor song; this measure was
called the "percent correctly matched." After selecting a "best
match" tutor song, observers scored the song pair on spectral
similarity and on temporal similarity using a scale from 1 to 5. For
spectral similarity, observers only considered syllable morphology and
sequence. A score of 1 referred to a song pair for which no elements in
the experimental song resembled anything in the best match tutor
song; 2 was given to a song pair when some notes in the experimental
song resembled notes present in the best match song; 3 designated a
song pair in which one or more syllables of the experimental song
resembled distinctive syllables of the best match song; 4 referred to a song pair for which several experimental song syllables resembled those
of the best match song, and the syllable sequences were somewhat
similar; and 5 was given to a song pair when the experimental song
resembled the best match song in both syllable morphology and sequence,
making it a good copy of the best match song.
To judge temporal similarity, observers disregarded the spectral
features of song and considered only the durations of syllables and
intervals and their patterns, or rhythm, within the songs. Each song
pair was scored on a scale of 1 to 5. A score of 1 referred to a song
pair for which a timing similarity between the experimental song and
the best match song could not be detected; 2 indicated a song pair for
which the relative durations of at least two syllables and the interval
between them in the experimental song resembled timing in the best
match song (e.g., doublets or triplets were heard in both songs); 3 was
given to a song pair when combinations of doublets or triplets in the
experimental song resembled the timing structures of the best match
song; 4 was given to a song pair when many syllables and intervals of
the experimental song had relatively similar duration and patterning as
those in the best match song; and 5 indicated a song pair for which the
timing of the experimental song was highly similar to that of the best match song, although differences in speed may have been apparent.
Songs of non-ts cut birds were also included among the experimental
songs for analysis, and their respective tutor songs were also present
among the possible tutor choices; this provided references against
which ts cut song similarity scores could be compared. Normal 60 d
song (n = 16), normal adult song (n = 9), and randomly matched song (songs for which the correct tutor was
not present among the possible tutor choices; n = 6)
were also matched to a tutor song and scored for spectral and temporal
similarity. Scores given to normal adult songs provided an upper bound
of similarity between songs from normal adults and their tutors, whereas scores given to randomly matched songs provided a lower bound
of similarity. Randomly matched songs included those from two normal
adult, two normal 60 d, and two ts cut 60 d birds.
To control for slight scoring differences between observers, we
normalized each observer's score for a song by the observer's mean
score for all songs. Thus, if an observer scored the spectral similarity of a song pair as a 5, but the observer's mean score was a
3, then the score for this particular song pair was 5/3 = 1.7. The
normalized scores for birds ranged from 0.30 to 2.43. The final score
for each song was the average of each observer's normalized score.
This final score included scores given to incorrect experimental-tutor
song matches. Scores calculated with incorrect matches excluded were
not significantly different (paired t test, p > 0.05); this indicates that incorrectly chosen
tutor songs were as dissimilar from the experimental song as the tutor
song itself. The mean score for song type (i.e., ts cut, normal 60 d, adult control, and randomly matched) was calculated from the final
scores for each song belonging to the song type.
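The score normalization described above might look like the following sketch. The nested-dictionary layout and the function name are assumptions for illustration; the arithmetic follows the text (each score divided by that observer's mean, then averaged across observers).

```python
from statistics import mean

def final_scores(raw):
    """Normalize each observer's scores by that observer's mean score,
    then average the normalized scores across observers for each song.

    raw: {observer: {song: score on the 1-5 scale}} (hypothetical layout).
    """
    per_song = {}
    for observer, scores in raw.items():
        observer_mean = mean(scores.values())
        for song, score in scores.items():
            per_song.setdefault(song, []).append(score / observer_mean)
    return {song: mean(values) for song, values in per_song.items()}
```

With an observer whose mean score is 3, a raw score of 5 becomes 5/3, about 1.7, matching the worked example in the text.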
The similarity between each experimental song (ts cut, normal 60 d, normal adult, and randomly matched songs) and its tutor song was
also measured with a cross-correlation algorithm (Theunissen and Doupe,
1998). One song waveform was moved relative to another in 1 msec
increments, and an r² value was
calculated for each time delay. The maximum was used as the
"cross-correlation measure." Unlike the spectral and temporal similarity scoring in the matching task, cross-correlations were done
between an experimental song and the correct tutor song (except for
randomly matched songs; these were cross-correlated to the tutor song
most often chosen by observers in the matching test).
To measure overall similarity, the entire spectrogram of an
experimental song was cross-correlated to the entire spectrogram of the
tutor song. To measure spectral similarity, the "syllables-only" cross-correlation measure was calculated for each song pair. For this,
each isolated syllable of the experimental song was compared with each
isolated syllable of the tutor song. The cross-correlation measure was
calculated for each comparison, and the maximum was taken as the best
match for the syllable. The resulting maxima were then averaged to
produce the syllables-only cross-correlation measure. To measure
temporal similarity, each song waveform was rectified and low-pass
filtered at 62.5 Hz. The filtered versions of experimental song and
tutor song were then cross-correlated to give the "temporal
envelope" cross-correlation measure.
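To make the temporal envelope measure concrete, here is a simplified one-dimensional sketch: rectify, smooth, then take the maximum r² over time lags. It is not the published algorithm. A trailing moving average stands in for the 62.5 Hz low-pass filter, the minimum-overlap guard is an added assumption, and all names are hypothetical.

```python
def rectified_envelope(waveform, window):
    """Rectify, then smooth with a trailing moving average of `window` samples
    (a stand-in for a true low-pass filter)."""
    rect = [abs(v) for v in waveform]
    out, running = [], 0.0
    for i, v in enumerate(rect):
        running += v
        if i >= window:
            running -= rect[i - window]
        out.append(running / min(i + 1, window))
    return out

def pearson_r2(x, y):
    """Squared Pearson correlation; 0.0 if either signal is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def max_lagged_r2(x, y, max_lag, min_overlap):
    """Slide x against y one sample at a time and return the largest r^2."""
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        xs, ys = (x[lag:], y) if lag >= 0 else (x, y[-lag:])
        n = min(len(xs), len(ys))
        if n >= min_overlap:  # guard against spurious fits on tiny overlaps
            best = max(best, pearson_r2(xs[:n], ys[:n]))
    return best
```

With envelopes sampled at 1 kHz, one sample per lag step corresponds to the 1 msec increments described above.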
To further compare temporal features of song, overlap values were
calculated between these song pairs (program by Michael Brainard,
UCSF). For this, the syllables of each song were replaced with square
pulses of equal amplitude. The resulting square pulse strings preserved
syllable and interval durations and their patterns found in the
original songs. The square pulse string of an entire experimental song
was then compared with that of the entire tutor song by calculating the
percent overlap between syllables and intervals. The proportion of
overlap between experimental syllables and tutor song syllables was
calculated separately from the proportion of overlap between
experimental intervals and tutor song intervals. The mean of the
syllable and interval overlap values was the "song-song overlap" value.
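The square pulse comparison can be sketched as follows. This is not the program by Michael Brainard referred to above; it is a minimal illustration under stated assumptions: strings are sampled in 1 msec bins, the two strings are aligned at their onsets and truncated to the shorter one, and each proportion of overlap is taken as shared bins divided by bins occupied in either song. The text does not specify these choices, and the function names are hypothetical.

```python
import numpy as np

def pulse_string(syllables, total_ms):
    """Boolean string at 1 msec resolution: True inside syllables,
    False during intervals. `syllables` holds (onset_ms, offset_ms) pairs."""
    s = np.zeros(total_ms, dtype=bool)
    for onset, offset in syllables:
        s[onset:offset] = True
    return s

def song_song_overlap(exp, tutor):
    """Mean of syllable overlap and interval overlap, in percent."""
    n = min(len(exp), len(tutor))  # assumed alignment: common onset, truncate
    e, t = exp[:n], tutor[:n]
    syl_union = np.sum(e | t)
    int_union = np.sum(~e | ~t)
    # Assumed normalization: bins shared by both songs / bins used by either song.
    syl = np.sum(e & t) / syl_union if syl_union else 0.0
    itv = np.sum(~e & ~t) / int_union if int_union else 0.0
    return 100.0 * (syl + itv) / 2.0
```

For example, two 300 msec strings whose first syllables coincide but whose second syllables are offset by 50 msec give an intermediate value, whereas identical strings give 100.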
In addition, a "motif-song overlap" value was calculated, which
maximized the chance of overlap. The song-song overlap measure described above could miss timing similarities between motifs of two
songs, if there were different intervals between multiple motifs within
a song. To avoid this, the motif-song overlap value compared a string
based on a single motif of the experimental song with a string based on
the entire tutor song. In addition, the song-song overlap measure
could miss timing similarities if there were differences in song speed;
thus, the motif-song overlap calculations allowed the motif string to
stretch proportionately 80-120% of its original length, in 2%
increments. The percent overlap between each stretched version of the
motif string and the tutor song string was calculated, and the maximum
was taken as the "maximum overlap" value. Finally, overlap values
are sensitive to the complexity of the motif string of the experimental
bird. For example, a simple motif comprising only two syllables is
likely to give a high maximum overlap value for both the tutor song and a random song. To correct for this, the maximum overlap value was
normalized by how well the motif overlapped with random songs. To
obtain a measure for random overlap, the maximum overlap value was
determined between the motif string and 20 randomly chosen, normal
adult song strings. The mean of the 20 maximum overlap values gave the
"random overlap" value. This random overlap value was used to
normalize the maximum overlap value obtained from the comparison of the
motif and tutor song strings, such that:
|
|
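The stretch search and the random overlap baseline described above can be sketched as follows. This is a hedged illustration, not the original program: it assumes boolean pulse strings (True for syllable, False for interval), nearest-neighbour resampling for the stretch, and a search over every position of the motif within the song, with overlap taken as the percentage of agreeing bins. None of these details are specified in the text, the function names are hypothetical, and the final normalization step is not included.

```python
import numpy as np

def stretch(string, factor):
    """Resample a boolean pulse string to round(factor * length) bins."""
    new_len = max(1, int(round(len(string) * factor)))
    idx = np.floor(np.arange(new_len) * len(string) / new_len).astype(int)
    return string[idx]

def best_alignment_overlap(motif, song):
    """Highest percentage of agreeing bins over all offsets of motif in song."""
    if len(motif) > len(song):
        return 0.0
    best = 0.0
    for offset in range(len(song) - len(motif) + 1):
        agree = np.mean(motif == song[offset:offset + len(motif)])
        best = max(best, agree)
    return 100.0 * best

def maximum_overlap(motif, song):
    """Best overlap over stretches of 80-120% of motif length, in 2% steps."""
    factors = np.linspace(0.80, 1.20, 21)
    return max(best_alignment_overlap(stretch(motif, f), song) for f in factors)

def random_overlap(motif, random_songs):
    """Mean maximum overlap between the motif and a set of unrelated songs."""
    return float(np.mean([maximum_overlap(motif, s) for s in random_songs]))
```

In this sketch, `random_overlap` would be called with the 20 randomly chosen adult song strings, and its result used as the baseline against which the motif-tutor value is normalized.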
Song analysis: stereotypy. We measured song
stereotypy of each bird in three ways: human subjective scoring,
syllables-only cross-correlations, and motif-song overlap analysis.
For reference, songs of normal adult and normal 60 d birds were
included in the stereotypy test. For each bird, 10 song bouts were
randomly selected for analysis (except in four cases: two normal
60 d birds had two songs each; one normal 60 d bird had five
songs; and one ts cut 60 d bird had only three songs).
Three observers rated how consistently a particular motif was present
in each song sample from a single bird on a scale from 1 to 5. They
listened to each song sample and looked at their accompanying sonograms
and oscillograms before deciding on the score. Both spectral and
temporal pattern repeats contributed to the score. A score of 1 referred to a group of songs that were not at all stereotyped: short
syllable sequences and small temporal patterns were rarely, if at all,
repeated in the song samples. A score of 2 indicated that a particular
syllable sequence or brief temporal pattern was repeated in half or
fewer of the song samples. A 3 was given when a short syllable sequence
or temporal pattern was repeated in almost all or all song samples.
Alternatively, a 3 was given if an entire motif structure was repeated
in only half of the song samples. The syllables outside of the repeated structures could vary in identity and ordering. A score of 4 was given
when an entire motif structure was apparent in most or all of the song
samples; however, some variability remained, with syllables added or
dropped from the motif in different renditions. A score of 5 was given
when identical motifs were found in every song sample. Each score was
normalized by the observer's mean score, as described for the
similarity scoring in the matching task. Normalized stereotypy scores
ranged from 0.26 to 1.40.
To isolate spectral stereotypy, we used cross-correlations to measure
how consistently the syllables in one song were present in the other
song samples. Spectral stereotypy was calculated in the same manner as
the syllables-only cross-correlation measure of similarity described
above, except that the cross-correlations were done between syllables
from songs of the same bird. The mean of the resulting syllables-only
cross-correlation measures (usually nine coefficients) gave the
spectral stereotypy measure for the bird.
Motif-song overlap analysis was used to measure temporal stereotypy.
To measure how consistently temporal patterns were repeated in song
samples from a bird, a motif-song overlap value was obtained for an
experimental motif string and each song sample string (usually nine
sample strings), as described above. These were also normalized by a
random overlap value, which was obtained by calculating the maximum
overlap between the motif string and nine randomly chosen songs from
all the experimental groups (adult control, 60 d control, and
60 d ts cut). The normalized motif-song overlap values for each
comparison between songs from the same bird were then averaged to give
an overlap stereotypy measure.
 |
RESULTS |
Songs of ts cut birds at 60 d
Bilateral NXIIts (ts) transections do not disturb the respiratory
outputs involved in song production; thus, birds receiving ts cuts at
~30 d of age readily sang, but because they could not control their
syringeal musculature, they produced extremely abnormal songs by
60 d. These birds sang a series of simple syllables consisting of
harmonically related notes. These "harmonic stack" syllables had
little amplitude modulation (Fig.
3A), and the frequencies of
the stacks often fluctuated, giving the song a wavery quality. The song
of this ts cut bird (tsBOS) was very different from its tutor song
(Fig. 3B). Although the syllables of the tsBOS shown were
longer than normal, the average syllable and interval durations in ts
cut song were not significantly different from those of normal adult or
60 d song (p > 0.635 for all comparisons,
unpaired t tests). The song of a normal 60 d sibling of
the bird in Figure 3A is shown for comparison (Fig.
3C). Although this normal 60 d song had immature
features such as noisier syllables and a longer song duration than the
tutor, it had clear similarity in syllable morphology and timing to the
tutor song. Thus, the ts cut manipulation produced songs that were
considerably simpler than normal plastic song, and dramatically reduced
the similarity between BOS and tutor song that can occur by 60 d.

|
Figure 3.
NXIIts nerve transections minimized the similarity
between BOS and tutor song at 60 d. A, Sonogram and
oscillogram of the song of a ts cut bird at 60 d, which underwent
nerve transections at 29 d. Sonograms plot frequency versus time,
and the energy of each frequency band is indicated by its
darkness; oscillograms plot the amplitude of the song
waveform versus time. Calibration: A-C, 500 msec.
B, Tutor song of the ts cut bird in A.
Introductory notes (I) and syllables
(A, B) are labeled. C, Song of a 60 d juvenile whose tutor song is also shown in B.
Syllables that resemble those in the tutor song are labeled (i.e.,
syllable b in the juvenile song is similar to syllable
B in the tutor song). D, Measures of
similarity to tutor song from the matching task shown for different
bird groups. Black circles show the mean percentage of
observers that matched a song to the correct tutor song (left
ordinate). This mean averages the frequency of matching across
all songs in each song type. There is no percentage for random matches,
because their correct tutor was never present among the tutor song
choices. The mean spectral (white circles) and
temporal (white triangles) similarity scores are plotted
along the right ordinate. Error bars indicate SEM.
|
|
We quantified the decrease in similarity between tsBOS and tutor songs
using multiple methods of song analysis. In a matching task, observers
tried to match an experimental song (ts cut 60 d, normal 60 d, normal adult, or random) with that of its tutor, which was present
in a group of possible tutors (see Materials and Methods). Songs from
ts cut birds were correctly matched to their tutor song significantly
less frequently than were songs from normal 60 d birds (Fig.
3D) (unpaired t test, p < 0.002). Because NXIIts transections in adult birds are known to
preserve the overall timing of song but to eliminate normal spectral
features (Simpson and Vicario, 1990
; Williams and McKibben, 1992
),
observers also scored separately the spectral and the temporal
similarity between each experimental song and the chosen tutor song
(see Materials and Methods). The resulting mean spectral similarity score and the mean temporal similarity score for ts cut songs were
significantly lower than those for normal 60 d songs (Fig. 3D) (unpaired t test, p < 0.0001 for spectral similarity; p < 0.001 for temporal
similarity). For reference, the mean spectral and temporal similarity
scores for randomly matched songs (songs whose actual tutor song was
not present among the choices; for details, see Materials and Methods)
and for normal adult songs are also shown. Note that ts cut songs had
significantly lower spectral similarity scores than did the randomly
matched songs (unpaired t test, p < 0.002).
Comparison of LMAN neural responses to tsBOS and tutor song
Extracellular recordings of 52 LMAN neurons from 16 ts cut birds
revealed selectivity for tsBOS. Figure
4A shows a neuron that
responded substantially more to tsBOS than to tutor song, adult
conspecific song, and reverse tsBOS (a "mirror image" reversed song
in which both entire syllables and syllable sequence are reversed).
Thus, this neuron was sensitive to the spectral and temporal properties
of tsBOS, despite its simple structure. Many other neurons showed a
strong preference for tsBOS over tutor song. We quantified the
preference for tsBOS over tutor song for each neuron with a
d'tsBOS-tutor value (see Materials and Methods); neurons with d'tsBOS-tutor values
≥ 0.5 were considered to prefer tsBOS over tutor song, and neurons with
d'tsBOS-tutor values of −0.5 or less were
considered to prefer tutor over tsBOS. Classified in this way, 28% of
LMAN neurons preferred tsBOS over tutor song, and only 5% of neurons
preferred tutor over tsBOS (Fig. 4B). These strong
preferences for the abnormal tsBOS over tutor song demonstrate the
ability of BOS experience to shape LMAN neuron properties.
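The classification by d' can be illustrated with a short sketch. The definition of d' is given in Materials and Methods and is not reproduced in this section; the form used below, twice the difference in mean response strength divided by the square root of the summed variances, is an assumption, as are the function names. The criterion of 0.5 follows the text, with values of 0.5 or more taken to indicate a tsBOS preference and values of −0.5 or less a tutor song preference.

```python
import numpy as np

def d_prime(rs_a, rs_b):
    """Assumed form: d' = 2 * (mean_A - mean_B) / sqrt(var_A + var_B),
    computed from per-trial response strengths to stimuli A and B."""
    a = np.asarray(rs_a, dtype=float)
    b = np.asarray(rs_b, dtype=float)
    denom = np.sqrt(a.var(ddof=1) + b.var(ddof=1))
    return float(2.0 * (a.mean() - b.mean()) / denom)

def classify_preference(d_tsbos_tutor, criterion=0.5):
    """Sort a neuron into one of the three preference categories."""
    if d_tsbos_tutor >= criterion:
        return "tsBOS"
    if d_tsbos_tutor <= -criterion:
        return "tutor"
    return "equivalent"
```

With this rule, the neuron of Figure 4A (d'tsBOS-tutor = 1.50) falls in the tsBOS category, and a neuron with d'tsBOS-tutor = 0.14 falls in the equivalent category. The sketch does not handle the case in which both variances are zero.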

|
Figure 4.
LMAN selectivity for tsBOS at 60 d.
A, PSTHs show the greater response of a single LMAN
neuron to tsBOS than to tutor song, reverse tsBOS, and adult
conspecific song; 20 trials of each song were presented. For this
neuron, d'tsBOS-tutor = 1.50;
d'tsBOS-rev = 1.11; and
d'tsBOS-adult con = 1.41. B,
The cumulative distribution of tsBOS versus tutor song preferences for
all LMAN neurons recorded, as quantified with
d'tsBOS-tutor values, is shown with
white circles. For comparison, the distribution of
d'BOS-tutor values from normal 60 d
birds is shown with black circles. Gray
shading highlights those cells considered to respond
equally well to both songs. Inset, Mean RS of all LMAN
neurons recorded to BOS and tutor song for both ts cut (white
circles) and normal (black circles) 60 d
birds. Error bars indicate SEM.
|
|
Unexpectedly, many LMAN neurons responded equally well to tsBOS and
tutor song, despite the large acoustic differences between these two
songs. Figure 5A shows an
example of such a neuron, which came from a ts cut bird whose song was
matched to the correct tutor song by only one of nine observers. This
type of neuron represented a substantial proportion of LMAN neurons
recorded (Fig. 4B): 67% of the neurons had
d'tsBOS-tutor values between
−0.5 and 0.5, thus classifying them as neurons with equivalent responses to both
tsBOS and tutor song. Overall, the mean of
d'tsBOS-tutor values of neurons from ts cut
birds was not significantly different from that obtained from normal
60 d birds (Fig. 4B) (unpaired t
test, p = 0.089; normal 60 d data from Solis and
Doupe, 1997
). On average, tsBOS elicited a greater response than tutor
song, as was true for LMAN neurons from normal 60 d birds (Fig.
4B, inset; paired t test,
p < 0.004 for neurons from ts cut birds; n = 46).

|
Figure 5.
Equivalent responses to tsBOS and tutor song.
A, PSTHs show the responses of a single LMAN neuron to
13 presentations of each song. Although this neuron responded
equally well to tsBOS and tutor song
(d'tsBOS-tutor = 0.14), it did not respond
well to adult conspecific song (d'tsBOS-adult con = 1.13; d'tutor-adult con = 1.08) or to reverse tutor song (d'tutor-rev = 1.08). B, The tsBOS versus tutor song preference of
each LMAN neuron is compared with its selectivity by plotting
d'tsBOS-tutor values against
d'tsBOS-rev (black circles)
and d'tutor-rev (open
circles) values. Gray shading indicates those
neurons that responded equally well to tsBOS and tutor song. The
dashed vertical line marks the criterion for selectivity
(d' = 0.5). C, This histogram shows the
number of LMAN neurons classified as selective (black
bars) and unselective (hatched bars) in the
three different tsBOS versus tutor song preference categories.
D, For those neurons responding equally well to both
tsBOS and tutor song, histograms show paired comparisons between the
mean RS to tsBOS (black bars) or tutor song
(white bars) and the mean RS to adult conspecific song.
E, For those neurons responding equally well to tsBOS
and tutor song, histograms show paired comparisons between the mean RS
to tsBOS (black bars) or tutor song (white
bars) and the mean RS to their corresponding reverse songs. In
D and E, error bars indicate SEM, and
asterisks denote significant differences between each
pair of stimuli.
|
|
Neurons with equivalent responses to acoustically dissimilar tsBOS and
tutor songs might indicate that both song experiences shape the
selectivity of single neurons. There are alternative explanations for
such neurons, however. First, these neurons might not have exhibited a
stronger preference for tsBOS because they were tested with a version
of tsBOS that was not optimal for eliciting responses; the variability
of plastic song at 60 d makes this possible. Second, it is
possible that neurons with similar responses to tsBOS and tutor song
are simply immature: neurons from younger (30 d) birds respond
equally well to all song stimuli (Doupe, 1997
). Third, the equivalent
responses to tsBOS and tutor song could be attributable to residual
similarities between the two songs. Although song analysis revealed
that, on average, tsBOS songs share little similarity with tutor song,
it is important to compare each bird's neural properties with the
similarity between its tsBOS and its tutor song. The first two
alternative explanations are discussed immediately below; the third
possibility will be examined in the last section of Results, using
detailed song analysis.
Plastic song renditions elicited equivalent neural responses
Because of plastic song variability normally present at 60 d,
it seemed possible that neurons without a strong tsBOS preference had
been presented with a version of plastic song to which neurons were less responsive. To assess this, LMAN neurons were presented with
three different renditions of tsBOS in eight experiments. Many neurons
responded equally well to all three renditions, whereas others
responded more to the tsBOS version most frequently produced by the
bird. This version was always used as the primary tsBOS, to which all
other songs were compared when measuring selectivity. Overall, there
was no significant difference in the responses elicited by the three
versions of tsBOS (ANOVA, p = 0.954; n = 21). Thus, it is unlikely that selectivity measurements were biased by inappropriate tsBOS presentation.
LMAN neurons with equivalent responses to tsBOS and tutor song were
not simply immature
Because AF neuron selectivity increases between 30 d
and adulthood (Doupe, 1997
), selectivity can be used to assay neuronal maturity. Two types of selectivity were analyzed to determine whether
neurons were immature. First, neural responses to tsBOS and tutor song
were compared with those to adult conspecific songs. Second, neural
responses to tsBOS and tutor song were compared with those to reversed
versions of these songs; for such reversed stimuli, both the entire
syllables and the sequence of syllables within the song were reversed.
Immature neurons would respond equally well to all of these stimuli
(Fig. 1C). When we analyzed the selectivity of individual
neurons with similar responses to tsBOS and tutor song, however, it was
clear that these neurons were not simply immature. For example,
although the neuron in Figure 5A responded strongly to both
tsBOS and tutor song, it did not respond well to adult conspecific or
reverse tutor song. Figure 5B further illustrates this
selectivity by plotting the tsBOS versus tutor song preference of each
neuron (indicated by its d'tsBOS-tutor value)
against a measure of selectivity (d'tsBOS-rev
and d'tutor-rev). Many neurons responding equally well to tsBOS and tutor song had d' values exceeding
0.5 for these measures of selectivity, indicating that they responded substantially more to tsBOS and tutor song than to reverse songs (Fig.
5B, points that lie within the gray zone
and to the right of the dashed vertical line).
Similarly, neurons with equivalent responses to tsBOS and tutor songs
still discriminated between these songs and adult conspecific song
(data not shown). Figure 5C shows the result of classifying
neurons as selective or unselective. We considered a neuron to be
selective if it had a d' value
≥ 0.5 for any one of four
selectivity categories: tsBOS-adult conspecific, tutor-adult
conspecific, tsBOS-reverse, and tutor-reverse. Classified in this
way, 66% of neurons responding equally well to tsBOS and tutor song
were selective. In comparison, 68% of this neuron type were classified
as selective in normal 60 d birds (Solis and Doupe, 1997
). Only 8 of 52 LMAN cells in the ts cut birds resembled 30 d neurons, with
similar responses to every song stimulus, and seven of these came from
the same animal.
Another measure of maturity is to consider the selectivity of a
population of neurons by averaging their responses to different song
stimuli. Individual neurons that do not themselves
meet the d' criterion for selectivity, but whose responses are slightly biased toward selectivity, can still contribute to the selectivity of an entire population of cells. As a population, LMAN neurons with
similar responses to tsBOS and tutor song had greater RS on average to
tsBOS and tutor song than to adult conspecific (Fig. 5D) and
reverse songs (Fig. 5E) (paired t test, for
tsBOS-adult conspecific, p < 0.0001;
n = 27; for tutor-adult conspecific, p < 0.004; n = 27; for tsBOS-reverse, p < 0.0001; n = 26; for tutor-reverse,
p < 0.011; n = 21). Thus, using both
individual neuron and population measures, neurons with equivalent
responses to tsBOS and tutor song exhibited selectivity, unlike
immature neurons.
Alternative methods of measuring neural selectivity
In the previous analyses, comparisons of neural responses to
different stimuli can be affected by stimulus duration. A neuron's RS
to a stimulus was calculated by normalizing the number of spikes fired
during the stimulus by the stimulus duration. If neural responses
fatigue during presentation of a long stimulus, then this method will
result in an RS that is less than the neuron's initial firing rate to
the stimulus. This phenomenon can complicate comparisons between
responses to two songs when the song durations differ substantially.
For example, if two songs with large duration differences elicit the
same number of spikes from a cell, then the RS to the longer song will
be much less than that to the shorter song; d' measures,
which compare RS to two stimuli, would tend to favor the shorter of the
two stimuli. In this study, 7 of 19 experiments had substantial
differences between tsBOS and tutor song duration, in which one song
was at least twice as long as the other song. An example of the effect
of normalizing by song duration is shown in Figure
6A; the
d'tsBOS-tutor value obtained indicates a strong
preference for the shorter tutor song, yet the PSTHs show qualitatively
similar responses of an LMAN neuron to tsBOS and tutor song. When the
d'tsBOS-tutor values of individual neurons were
compared with the relative difference in duration between tsBOS and
tutor song, as expressed by the ratio (durationtsBOS −
durationtutor)/(durationtsBOS + durationtutor), a strong correlation resulted
(r² = 0.584; p < 0.0001); d'tsBOS-tutor values reflected a preference for the shorter of the two songs.
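The duration effect can be made concrete with a small worked example. The sketch below follows the description above in dividing spike count by stimulus duration, and computes the relative duration difference used in the correlation. The numbers are hypothetical, chosen only to show the direction of the bias, and the function names are assumptions.

```python
def response_rate(spike_count, duration_s):
    """Spikes fired during the stimulus, normalized by stimulus duration."""
    return spike_count / duration_s

def duration_index(dur_tsbos_s, dur_tutor_s):
    """(duration_tsBOS - duration_tutor) / (duration_tsBOS + duration_tutor)."""
    return (dur_tsbos_s - dur_tutor_s) / (dur_tsbos_s + dur_tutor_s)

# Hypothetical case: both songs evoke 30 spikes, but tsBOS is three times longer.
rate_tsbos = response_rate(30, 1.8)   # lower rate for the longer song
rate_tutor = response_rate(30, 0.6)   # higher rate for the shorter song
index = duration_index(1.8, 0.6)      # positive: tsBOS is the longer song
```

Here the rate for the tutor song is three times that for tsBOS even though the spike counts are identical, so a d' value computed from these rates would favor the shorter tutor song.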

|
Figure 6.
Stimulus duration can influence the quantification
of tsBOS versus tutor song preference. A, PSTHs show the
responses of a single LMAN neuron to 12 presentations of tsBOS and a
short tutor song (602 msec). Although the responses appear equivalent,
the d'tsBOS-tutor value indicates a
preference for the shorter tutor song. B, The
distributions of peak d'tsBOS-tutor values
are shown for all LMAN (black) and X
(open) cells, regardless of stimulus duration.
C, For those cells responding equally well to tsBOS and
tutor song (according to their peak
d'tsBOS-tutor values), histograms show the
mean peak RS to different stimuli for LMAN (left
panel) and X (right panel) cells.
Paired comparisons show that tsBOS (black bars) and
tutor song (white bars) elicited greater average
responses than did adult conspecific song; asterisks
denote significant differences, and error bars indicate SEM.
|
|
To investigate the impact of this duration effect on the results so far
described, data from experiments in which tsBOS and tutor song had
similar durations (difference in duration was less than twice the
shorter song) were analyzed separately (30 LMAN neurons from 10 experiments). Within this data subset, the properties described for the
whole population persisted; some LMAN neurons preferred tsBOS over
tutor song, whereas others responded equally well to these two songs.
Among LMAN neurons responding similarly to tsBOS and tutor song, 93%
(13 of 14) were classified as selective, and on average this type of
neuron responded more to tsBOS and tutor song than to adult conspecific
and reverse songs (data not shown) (paired t test, for
tsBOS-adult conspecific song, p < 0.001; n = 14; for tutor-adult conspecific song,
p < 0.008; n = 14; for tsBOS-reverse,
p < 0.009; n = 13; for tutor-reverse,
p < 0.031; n = 11). Thus, the neuronal
properties present for the whole data set also described the subset of
data collected from experiments with similar tsBOS and tutor song durations.
Another method of removing stimulus duration bias from selectivity
measures is to obtain a peak firing rate for each stimulus. Peak firing
rate assesses a neuron's maximum response during a stimulus,
regardless of where it occurs in time. For each LMAN neuron, the
maximum firing rate occurring within a sliding 500 msec window was used
to calculate a peak RS to each stimulus (see Materials and Methods);
thus, every response was normalized by 500 msec, regardless of stimulus
duration. Peak d' values were then calculated using the peak
RS to different stimuli. The resulting peak
d'tsBOS-tutor values indicated that there were
still neurons that responded equally well to tsBOS and tutor song
(53%), and neurons that preferred tsBOS over tutor song (47%) (Fig.
6B). Of the neurons responding equally well to tsBOS
and tutor song, 63% were selective, as determined from their peak
d' values in the four selectivity categories. In addition,
the population of neurons with similar responses to tsBOS and tutor
song were also selective when their responses were measured using peak
RS; neurons responded on average significantly more to tsBOS and tutor
song than to adult conspecific (Fig. 6C) and reverse (data
not shown) songs (paired t tests: for tsBOS-adult
conspecific, p < 0.0001; n = 19; for
tutor-adult conspecific, p < 0.006; n = 19; for tsBOS-reverse, p < 0.0004;
n = 18; for tutor-reverse, p < 0.017;
n = 16). Thus, using peak RS and peak d'
values, neurons that responded similarly to tsBOS and tutor song were
still selective. Although peak d'tsBOS-tutor values reclassified 39% of LMAN neurons in terms of their tsBOS and tutor song preferences, the overall distribution was only slightly shifted toward tsBOS preference relative to the original d'tsBOS-tutor values (mean difference in
d'tsBOS-tutor = 0.22; paired t test,
p < 0.002; n = 43). Because there is
no duration difference between forward and reverse versions of the same
song, the maintenance of significant response differences between
forward and reverse versions of song with peak RS also indicates that
the 500 msec time window chosen was not too small to detect differences
between responses to different stimuli.
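A sliding-window peak rate of the kind described above can be sketched as follows. This is an illustration under assumptions, not the procedure in Materials and Methods: it takes the spike times recorded during one stimulus (in seconds), finds the 500 msec window containing the most spikes, and returns that count divided by the window length. Any baseline subtraction or averaging across trials is omitted, and the function name is hypothetical.

```python
import numpy as np

def peak_rate(spike_times_s, window_s=0.5):
    """Maximum firing rate (spikes/sec) within any window of length window_s."""
    t = np.sort(np.asarray(spike_times_s, dtype=float))
    if t.size == 0:
        return 0.0
    # The densest window can always be taken to start at a spike, so it is
    # enough to count the spikes in [t_i, t_i + window_s) for every spike i.
    ends = np.searchsorted(t, t + window_s, side="left")
    counts = ends - np.arange(t.size)
    return float(counts.max() / window_s)
```

Because every stimulus is scored over the same 500 msec window, a long song and a short song that evoke the same densest burst of spikes receive the same peak rate.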
Thus, LMAN properties in ts cut birds were the same when (1) the
measurement of RS originally used in this and other studies was applied
to the whole data set, (2) the original RS was used for a data subset
comprising neurons collected from experiments without large duration
differences between tsBOS and tutor song, and (3) peak RS was used to
measure responses of the whole data set. For all three analyses,
neurons that preferred tsBOS over tutor song and neurons that responded
equally well to tsBOS and tutor song were apparent. The latter neurons
were also selective. The original measurement of RS will be used to
describe further LMAN properties in this study, because it has been
used in previous studies of AF neurons.
Selectivity of the entire population of LMAN neurons
Song and order selectivity
We also examined in detail the song and order selectivity of the
entire population of LMAN neurons, regardless of their tsBOS versus
tutor song preferences. By definition, song-selective neurons respond
more to tsBOS or tutor song than to other song stimuli, such as adult
conspecific and heterospecific songs. For the entire population of LMAN
cells recorded, song selectivity was apparent for both tsBOS and tutor
song. On average, both tsBOS and tutor song produced significantly
stronger responses than adult conspecific (Fig.
7A) and heterospecific songs
(Fig. 7B) (paired t tests: p < 0.0001 for tsBOS-adult conspecific; n = 45; and
tsBOS-heterospecific; n = 47; p < 0.004 for tutor-adult conspecific; n = 43;
p < 0.010 for tutor-heterospecific; n = 47). The song selectivity of individual LMAN neurons is illustrated
with scatterplots comparing each neuron's RS to tsBOS (Fig.
7D) or tutor song (Fig. 7E) with its RS to adult conspecific song. In both plots, the majority of cells lie below the
diagonal line, indicating their stronger responses to tsBOS or tutor
song than to adult conspecific song. The percentages of selective LMAN
cells in each song selectivity category are listed in Table
1.

|
Figure 7.
Song selectivity of the entire population of LMAN
neurons recorded in ts cut birds. Paired comparisons of mean RS show
that neurons responded more to tsBOS and tutor song than to adult
conspecific (A) and heterospecific song
(B). C, Paired comparisons also
show greater responses to tsBOS than to ts cut and normal 60 d
songs. In A-C, error bars indicate SEM, and
asterisks mark significant differences between song
pairs. D, The mean RS to tsBOS of each neuron is plotted
against its mean RS to adult conspecific song (adult
con). The diagonal line marks where cells lie if
the RS to the two stimuli were equal. Black circles
indicate those neurons with significantly greater responses to the
stimulus on the abscissa (p < 0.05, unpaired t test between abscissa
stimulus trials and all adult conspecific trials). E,
The mean RS to tutor song of each neuron is plotted against the mean RS
to adult conspecific song. Conventions are as in
D.
|
|
To test whether neurons were tuned specifically to tsBOS, rather than
to the noisy, immature features common to all plastic songs, other
plastic songs of ts cut and normal 60 d birds were presented. On
average, neurons responded more to tsBOS than to other plastic songs;
however, this reached statistical significance for only the
tsBOS-normal plastic song comparison (Fig. 7C) (paired t test, p < 0.0001 for tsBOS-normal
plastic; n = 32; p = 0.055 for
tsBOS-ts cut plastic; n = 28). Thus, LMAN neurons were
tuned to features specific to tsBOS.
As a population, LMAN neurons from ts cut birds were also
order-selective. A neuron is considered order-selective when it responds significantly more to forward song than to a song that is
completely reversed (see labels in Fig. 8A). On
average, LMAN neurons responded significantly more to tsBOS and tutor
song than to reversed versions of these songs (Fig.
8A) (paired
t test, p < 0.002 for tsBOS-reverse;
n = 42; p < 0.013 for tutor-reverse; n = 30). The order selectivity of individual LMAN
neurons is shown by plotting each neuron's RS to tsBOS (Fig.
8D) or tutor song (Fig. 8E) against
its RS to the corresponding reverse song. In these scatterplots, many
cells lie below the diagonal line, indicating their stronger responses
to tsBOS or tutor song than to the corresponding reverse song
stimuli.

|
Figure 8.
Order selectivity of the population of LMAN
neurons recorded from ts cut birds. A, Paired
comparisons of mean RS show that neurons responded more to tsBOS and
tutor song in the forward direction than to their respective reverse
songs. B, The mean RS to tsBOS and tutor song were
greater than those to reverse order versions of these songs.
C, The mean RS to tsBOS was greater than to the syllable
reverse version of tsBOS. In A-C, error bars indicate
SEM, and asterisks mark significant differences between
song pairs. D, The mean RS to tsBOS of each neuron is
plotted against its mean RS to reverse tsBOS. The diagonal
line shows where cells lie when they respond equally to the two
stimuli compared. Black circles indicate those cells
that had significantly greater RS to the stimulus on the
abscissa than to the reverse manipulation
(p < 0.05, unpaired t test
between forward song trials and corresponding reverse song trials).
E, The mean RS to tutor song of each neuron is plotted
against its mean RS to reverse tutor song. Conventions are as in
D.
|
|
Features important to order selectivity
To test the importance of syllable sequence within a song for
order selectivity, "reverse order" stimuli were presented. Reverse order songs maintain the temporal order within individual syllables but
reverse the syllable sequence within a song (see labels in Fig.
8B). On average, cells responded significantly more
to forward tsBOS and tutor song than to reverse order versions of these
songs (Fig. 8B) (paired t test, for
tsBOS-reverse order, p < 0.010; n = 29; for tutor-reverse order, p < 0.050;
n = 33). Thus, cells were sensitive to the
syllable sequences within tsBOS and tutor song.
Because of the simple harmonic stack structure of syllables in many
tsBOS, it seemed possible that neurons would be insensitive to reversal
of the temporal structure within syllables from tsBOS. To test the
contribution of individual syllable structure to order selectivity for
tsBOS, we also presented "syllable reverse" stimuli. Syllable
reverse stimuli maintain the correct syllable sequence within a song
but reverse the individual syllables (see labels in Fig.
8C). On average, cells responded significantly more to forward tsBOS than to syllable reverse tsBOS (Fig. 8C)
(paired t test, p < 0.003;
n = 13). Thus, cells were also sensitive to the
temporal structure within the simpler tsBOS syllables. The percentage
of selective neurons in each order selectivity category is listed in
Table 1.
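The stimulus manipulations described above can be sketched as follows. This is a simplified illustration, not the procedure used to generate the actual stimuli: it assumes a song is represented as a list of syllables, each a list of samples, and it ignores the silent intervals between syllables. All function names are hypothetical.

```python
def full_reverse(song):
    """Reverse the whole song: syllable sequence and within-syllable order."""
    return [syllable[::-1] for syllable in reversed(song)]


def reverse_order(song):
    """Reverse the syllable sequence; keep each syllable's internal order."""
    return [syllable[:] for syllable in reversed(song)]


def syllable_reverse(song):
    """Keep the syllable sequence; reverse each syllable internally."""
    return [syllable[::-1] for syllable in song]


# Toy song with three syllables; the integers stand in for audio samples.
song = [[1, 2, 3], [4, 5], [6, 7, 8]]

rev = full_reverse(song)          # sequence and syllables both reversed
rev_order = reverse_order(song)   # sequence reversed, syllables intact
syl_rev = syllable_reverse(song)  # sequence intact, syllables reversed
```

Comparing responses to the forward song with responses to each manipulation separates sensitivity to syllable sequence (reverse order) from sensitivity to the temporal structure within syllables (syllable reverse).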
Comparison of X neural responses to tsBOS and tutor song
X is the first nucleus in the AF pathway; it receives
inputs from HVc and itself projects to DLM, which in turn projects to LMAN. In addition, X receives feedback via projections from LMAN. To understand the circuitry underlying AF selectivity and potential interactions between LMAN and X, 64 single X neurons were also recorded
from 19 ts cut birds.
As in LMAN, some X neurons responded more to tsBOS than to tutor song.
The neuron in Figure 9A not
only strongly preferred tsBOS over tutor song, but it also preferred
tsBOS over adult conspecific song and reverse tsBOS. In addition, many
X neurons responded equally well to tsBOS and tutor song, despite
the acoustic dissimilarity of these songs; an example of such a
neuron is illustrated in Figure
10A. The distribution
of d'tsBOS-tutor values from individual X
neurons is shown in Figure 9B: 37% of X neurons recorded preferred tsBOS over tutor song; 35% responded equally well to tsBOS
and tutor song; and 28% preferred tutor song over tsBOS. This
distribution did not differ significantly from that obtained from X
neurons in normal 60 d birds (unpaired t test,
p = 0.711; normal 60 d data from Solis and Doupe, 1997).
On average, in ts cut birds, neural responses to tsBOS were not
significantly different from those to tutor song (Fig. 9B,
inset) (paired t test, p = 0.862;
n = 63).
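The sorting of neurons into the three preference categories can be illustrated with a short sketch. The d' formula used here, 2(mean RS_A - mean RS_B) / sqrt(var_A + var_B), is assumed for illustration because the metric is not defined in this section. The 0.5 criterion is the selectivity criterion marked in Figure 10B; applying it symmetrically and inclusively to d'tsBOS-tutor is likewise an assumption. Function names and trial values are hypothetical.

```python
import math
import statistics

CRITERION = 0.5  # selectivity criterion; inclusive boundary is an assumption


def d_prime(rs_a, rs_b):
    """Assumed d' between the trial-by-trial RS to stimuli A and B."""
    mean_diff = statistics.mean(rs_a) - statistics.mean(rs_b)
    spread = math.sqrt(statistics.variance(rs_a) + statistics.variance(rs_b))
    if spread == 0:
        raise ValueError("d' is undefined when both responses have zero variance")
    return 2.0 * mean_diff / spread


def preference(d_tsbos_tutor, criterion=CRITERION):
    """Classify a neuron by its d'tsBOS-tutor value."""
    if d_tsbos_tutor >= criterion:
        return "prefers tsBOS"
    if d_tsbos_tutor <= -criterion:
        return "prefers tutor song"
    return "responds equally"


# Hypothetical trial-by-trial RS values (spikes/sec) for one neuron.
rs_tsbos = [10.0, 12.0, 14.0]
rs_tutor = [4.0, 6.0, 8.0]
d = d_prime(rs_tsbos, rs_tutor)
label = preference(d)
```

Under these assumptions, the example neurons described in the text sort as expected: the neuron of Figure 9A (d'tsBOS-tutor = 1.52) is classified as preferring tsBOS, and the neuron of Figure 10A (d'tsBOS-tutor = 0.43) as responding equally to the two songs.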
Figure 9.
Selectivity for tsBOS in X. A,
PSTHs show the responses of a single X neuron to 20 presentations of
each stimulus. This neuron responded more to tsBOS than to tutor song,
reverse tsBOS, and an adult conspecific song
(d'tsBOS-tutor = 1.52;
d'tsBOS-rev = 0.80; and
d'tsBOS-adult con = 2.35). The
dashed white line indicates the neuron's average
spontaneous firing rate. Note that the ordinate of the
PSTHs begins at 10 spikes/sec. B, Cumulative
distributions of d'tsBOS-tutor values of
individual X neurons from ts cut birds (white circles)
and normal 60 d birds (black circles).
Inset, Mean RS to BOS and tutor song of the population
of X neurons recorded from ts cut (white circles) and
normal (black circles) 60 d birds. Error bars
indicate SEM.
Figure 10.
Some X neurons responded equally well to tsBOS
and tutor song. A, PSTHs show the responses of a
single X neuron to 10 presentations of each stimulus. This neuron
responded more to tsBOS and tutor song than to reverse tsBOS or adult
conspecific song. For the responses shown,
d'tsBOS-tutor = 0.43;
d'tsBOS-rev = 1.43;
d'tsBOS-adult con = 1.06; and
d'tutor-adult con = 0.96. The white
dashed line indicates the neuron's average spontaneous firing
rate. Note that the ordinate of the PSTHs begins at 10 spikes/sec. This particular tsBOS was matched to the correct tutor song
by only one of nine observers. B, The
d'tsBOS-tutor value of each X neuron is
plotted against two measures of selectivity:
d'tsBOS-adult con (black
circles) and d'tutor-adult con
(open circles). The gray region
highlights those neurons considered to have responded equally well to
tsBOS and tutor song. The dashed vertical line marks the
criterion for selectivity (d' = 0.5). C,
Number of X neurons classified as selective (black bars)
and unselective (hatched bars) in the three different
tsBOS versus tutor song preference categories. D, For
those neurons responding equally well to both tsBOS and tutor song,
histograms show paired comparisons of the mean RS to tsBOS
(black bars) or tutor song (white bars)
to adult conspecific song. E, For those neurons
responding equally well to tsBOS and tutor song, histograms show paired
comparisons of the mean RS to tsBOS (black bars) or
tutor song (white bars) and their corresponding reverse
songs. In D and E, error bars indicate
SEM, and asterisks denote significant differences
between each pair of stimuli.
Plastic song renditions and neuronal maturity
The lack of a strong tsBOS preference in some neurons was unlikely to have
resulted from inappropriate tsBOS choice; X neurons responded equally
well to three different renditions of tsBOS (ANOVA, p = 0.079; n = 38). Furthermore, neurons with similar
responses to tsBOS and tutor song were also selective, indicating that
they were not immature. For example, the neuron in Figure
10A responded strongly to both tsBOS and tutor song
but substantially less to conspecific song and reverse tsBOS. The song
selectivity of neurons that responded equivalently to tsBOS and tutor
song was examined by plotting the d'tsBOS-tuto