The Journal of Neuroscience, December 1, 1999, 19(23):10461-10481
Singing-Related Neural Activity in a Dorsal Forebrain-Basal
Ganglia Circuit of Adult Zebra Finches
Neal A. Hessler and Allison J. Doupe
Keck Center for Integrative Neuroscience, Departments of Physiology
and Psychiatry, University of California, San Francisco, California
94143-0444
ABSTRACT
The anterior forebrain pathway (AFP) of songbirds, a specialized
dorsal forebrain-basal ganglia circuit, is crucial for song learning
but has a less clear function in adults. We report here that neurons in
two nuclei of the AFP, the lateral magnocellular nucleus of the
anterior neostriatum (LMAN) and Area X, show marked changes in
neurophysiological activity before and during singing in adult zebra
finches. The presence of modulation before song output suggests that
singing-related AFP activity originates, at least in part, in motor
control nuclei. Some neurons in LMAN of awake birds also responded
selectively to playback of the bird's own song, but neural activity
during singing did not completely depend on auditory feedback in the
short term, because neither the level nor the pattern of this activity
was strongly affected by deafening. The singing-related activity of
neurons in AFP nuclei of songbirds is consistent with a role of the AFP
in adult singing or song maintenance, possibly related to the function
of this circuit during initial song learning.
Key words:
birdsong; songbird; LMAN; Area X; feedback; deafening; reafferent; variability
INTRODUCTION
The singing of birds is a temporally
complex behavior (see Fig. 1A) that is crucial in
courtship and territorial contexts (for review, see Catchpole and
Slater, 1995). Like human speech, song is learned early in an
individual's life, by listening to and copying the vocalizations of
adults. In both songbirds and humans, vocal output is strongly
dependent on access to auditory feedback of the individual's own
vocalizations, especially during learning and to a somewhat lesser
extent in the mature individual (Konishi, 1965; Nordeen and Nordeen,
1992; Leonardo and Konishi, 1999). Finally, like humans, songbirds have
evolved specialized forebrain circuitry for producing complex vocalizations.
In songbirds, this circuitry, which is not found in closely related
species that do not learn to sing, is known as the song system (for
review, see Brenowitz, 1997; Doupe and Kuhl, 1999). Nuclei in the motor
pathway (see Fig. 1B, gray) are essential throughout life for normal song production. Their activity during singing is highly patterned (McCasland, 1987; Yu and Margoliash, 1996),
and disruption of their activity alters or abolishes song output
(Nottebohm et al., 1976; Vu et al., 1994). In contrast, a second
circuit of forebrain song nuclei has a less clear function in adult
motor production but, like hearing, plays a particularly critical role
during learning. This anterior forebrain pathway (AFP) (see Fig.
1B, black) links the motor pathway nuclei
HVc and robust nucleus of the archistriatum (RA), but via an indirect loop through basal ganglia (Area X), thalamus [the medial portion of
the dorsolateral thalamus (DLM)], and a cortex-like nucleus, the
lateral magnocellular nucleus of the anterior neostriatum (LMAN)
(Okuhata and Saito, 1987; Bottjer et al., 1989; Luo and Perkel,
1999a,b). Lesions or pharmacological inactivation of AFP nuclei
markedly disrupt initial song learning and production but have no
obvious effect on stable "crystallized" song of adult zebra finches
(Bottjer et al., 1984; Sohrabji et al., 1990; Scharff and Nottebohm,
1991; Basham et al., 1996). Moreover, in anesthetized birds, neurons in
these nuclei respond more strongly to playback of the bird's own song
than to songs of conspecific birds, and this neural selectivity
develops over the course of song learning (Doupe, 1997; Solis and
Doupe, 1997, 1999). Thus, results of both lesion and physiological
studies support a model in which the AFP is involved in the evaluation
of self-produced vocalizations that occur as the bird learns to produce
a copy of the song memorized early in life.
Although the AFP clearly is crucial during juvenile song learning, its
function in adult finches is less apparent. Recent experiments,
however, have raised several possibilities. First, the maintenance of
adult zebra finch song by auditory feedback (Nordeen and Nordeen, 1992;
Leonardo and Konishi, 1999) could be mediated by the AFP. Furthermore,
the AFP may be involved in the perception and classification of
conspecific songs (Scharff et al., 1998) and in the modulation of
behavior by social context (Jarvis et al., 1998; Hessler and Doupe,
1999). To investigate these possible functions, we undertook to
describe in detail the activity of the AFP nuclei in awake adults. We
report here that LMAN and Area X are strongly active during singing and
that this activity has many features in common with the singing-related activity of the motor nuclei HVc and RA (McCasland, 1987; Yu and Margoliash, 1996). Moreover, although we found that neurons in some
birds can show selective auditory responses to playback of songs,
initial experiments on the effects of manipulating hearing during
singing indicate that much of the singing-related activity does not
depend on hearing. These results suggest that the AFP, although it is
not required for adult song production, nevertheless has some function
in this complex learned behavior, perhaps serving in part as an
"efference copy" of singing motor commands from the song system
motor nuclei.
MATERIALS AND METHODS
Animals. Adult (>125 d old) male zebra finches
(Taeniopygia guttata) were used for experiments. One bird
was purchased from a local supplier, and the rest were raised from
hatching in our colony. The care and treatment of experimental animals
was reviewed and approved by a university animal care and use committee
at the University of California, San Francisco (UCSF). Birds were selected for recordings on the basis of size and singing frequency and
were then isolated in a small cage inside a sound-attenuation chamber
(Acoustic Systems, Austin, TX) and supplied with food, water, and grit
ad libitum, with occasional supplements of boiled chicken
egg. Several days after spontaneous singing occurred frequently in this
chamber, a chronic recording apparatus was implanted.
Surgical procedures. Birds were anesthetized with an
intramuscular injection of 30-40 µl equithesin (0.85 gm chloral
hydrate, 0.21 gm pentobarbital, 0.42 gm MgSO4,
2.2 ml 100% ethanol, and 8.6 ml propylene glycol in
H2O to a total volume of 20 ml) and placed in a
stereotaxic device stabilizing the head at the beak and ear canals. The
beak was positioned at an angle of 50° from vertical. The position of
the posterior border of the divergence of the central sinus at the
boundary of the forebrain and the cerebellum was noted. A small hole
was made in the skull at a specific position relative to this reference
coordinate: for LMAN and Area X, 5 mm anterior and 1.7 mm lateral; for
HVc, 0.0-0.2 mm anterior and 2.0-2.2 mm lateral; for RA, 0.2-0.7 mm
anterior and 2.1 mm lateral. All recordings were made from nuclei in
the right hemisphere. A previous study has reported that the pattern of
singing-related activity in the right and left HVc is similar (McCasland, 1987). A lightweight (~1 gm) microdrive (UCSF Physiology Shop) carrying an insulated tungsten electrode of impedance 2-5 MΩ
(either A-M Systems, Carlsborg, WA, or FHC, Bowdoinham, ME) was
positioned stereotaxically such that the electrode tip was ~500 µm
above LMAN or RA or ~300 µm above HVc. A reference ground electrode
(uninsulated tungsten electrode, A-M Systems) was implanted in the
contralateral (for LMAN and Area X) or ipsilateral (for HVc and RA)
hemisphere such that it passed within ~2 mm of the targeted nucleus.
The microdrive and its connector socket (FHC) were secured to the skull
with epoxy (DevCon, Wood Dale, IL) and dental cement (Dentsply,
Milford, DE), and a protective cap (3M, St. Paul, MN) was
fixed around it (see Fig. 1C).
Deafening. Deafening was performed in two stages, by the
method of Konishi (1965). Before implantation of the microdrive and electrode, birds were anesthetized with an injection of equithesin or
by inhalation of isoflurane. Feathers occluding the right external
auditory meatus were plucked, and the skin flap within the ear canal
was cut off to expose the tympanic membrane. The tympanic membrane was
detached from the columella, and the columella was removed. A hook of
tungsten wire was inserted through the oval window and withdrawn with
the attached cochlea. The extracted cochlea was examined under a
dissecting microscope to ensure that the entire structure had been
successfully removed, and the skin incision was closed with
cyanoacrylate adhesive. Several weeks later, after the skin inside the
right ear canal had regrown, the bird could be positioned in the
stereotaxic device using ear bars, and the microdrive-electrode was
implanted. After several days of recording in LMAN, the left cochlea
was removed as above, to complete the deafening.
Physiological recording. During each recording session, one
end of a flexible lead terminating in a small operational amplifier circuit (TLC27L7C, Texas Instruments, Dallas, TX) (Buzsaki et al., 1989)
was connected to the small socket on the bird's head, and the other
end was connected to a rotating commutator (H. Adams, Caltech Machine
Shop, or Radio Shack, Fort Worth, TX). The neural activity signal
passed through the commutator to a differential amplifier (A-M
Systems), where it was filtered between 300 Hz and 5 kHz. The acoustic
signal in the sound box was recorded by a small microphone (Radio
Shack) adjacent to the cage. After several days of adaptation to being
attached to recording leads, most birds began singing while they were
isolated in the sound box. During experiments, the electrode was
lowered to a position where large action potentials of single or
multiple neurons could clearly be differentiated from background neural
activity (larger spike amplitudes in each recording session ranged from
~300 µV to >1 mV, peak to peak). The bird's behavior was
monitored via a video camera inside the sound box, and the video,
neural, and acoustic signals were archived on videotape. A computer
program (developed by C. Malek, Caltech, and C. Roddey, UCSF) monitored
the sound amplitude inside the sound box and triggered the recording to computer disk of the acoustic and neural signals (both digitized at 32 kHz) for an adjustable period before and after passage of the sound
amplitude over a threshold level. Previously recorded versions of the
bird's own song, and in some cases songs of other zebra finches, were
played from a small speaker in the sound box during periods when the
bird was not singing. The volume of song playback from the speaker was
set such that its intensity in the vicinity of the bird's cage and the
amplitude of sound produced by the bird were approximately equal. The
intensity of both signals was monitored by a microphone immediately
adjacent to the cage (~10 cm from the bird).
Recordings were made at intervals of 1 d to 1 week, over periods
of 2 weeks to 2 months. After most recording sessions, the electrode
was retracted to a position above the nucleus. For each successive
recording session, the electrode was advanced by at least 80 µm
beyond the previous recording depth. After completion of recordings,
small electrolytic lesions (20 µA for 5 sec) were made at previously
recorded sites. Animals were deeply anesthetized with Metofane
(Pitman-Moore, Mundelein, IL) and intracardially perfused with 0.9%
saline, followed by 3.7% formalin. Lesions were localized in 40 µm
Nissl-stained brain sections. Locations of all recording sites were
confirmed to be within song nuclei by position relative to depth of
marker lesions.
Behavioral analysis. Zebra finches, like other songbirds,
produce several distinct types of vocalizations, which have been extensively characterized in behavioral studies (for review, see Zann,
1996). Here, we have limited our analysis to "undirected song,"
which birds produced while out of visual contact with other birds. In
zebra finches, undirected song contains a variable number of
motifs, each a stereotyped series of approximately 3-10 discrete vocal elements (syllables) (see Fig.
1A for examples of song
components), preceded by several simple introductory elements. Song
initiation was defined by the onset of the first introductory element
or song syllable with a <300 msec interval between its offset time and
the onset time of the next song element (further introductory element
or motif syllable). Song termination was defined by the offset time of
a syllable that was followed by at least 300 msec before the next
vocalization (~1000% of average intersyllable interval duration;
data not shown). This interval was lengthened for one bird that had an
exceptionally long intersyllable interval within its motif. For
characterization of presinging activity level, only song initiations
(as defined above) before which the bird had been silent for at least 3 sec were used. For characterization of postsinging activity level, only
song terminations after which the bird was quiet for at least 3 sec
were used. Thus, for many songs, the presong or postsong period or both
were not included in the analysis of activity. Background nonsinging
activity was recorded during periods in which no sounds were produced;
such periods were required to follow any vocalization by at least 3 sec
and to precede any vocalization by at least 3 sec.
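The song-boundary criteria above can be sketched in code. This is a minimal illustration, not the authors' analysis program: it assumes that each vocal element has already been reduced to an (onset, offset) pair in seconds, and the function names, the `t_start` and `t_end` recording limits, and the example times are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (onset_s, offset_s) of one vocal element


def find_songs(elements: List[Interval], max_gap_s: float = 0.3) -> List[Interval]:
    """Group elements separated by < max_gap_s; keep groups of >= 2 elements."""
    groups: List[List[Interval]] = []
    for onset, offset in sorted(elements):
        if groups and onset - groups[-1][-1][1] < max_gap_s:
            groups[-1].append((onset, offset))
        else:
            groups.append([(onset, offset)])
    # A lone element has no follower within max_gap_s, so it starts no song.
    return [(g[0][0], g[-1][1]) for g in groups if len(g) >= 2]


def quiet_initiations(songs: List[Interval], elements: List[Interval],
                      min_silence_s: float = 3.0, t_start: float = 0.0) -> List[float]:
    """Song onsets preceded by at least min_silence_s without vocalization."""
    usable = []
    for onset, _ in songs:
        last_offset = max((off for _, off in elements if off < onset), default=t_start)
        if onset - last_offset >= min_silence_s:
            usable.append(onset)
    return usable


def quiet_terminations(songs: List[Interval], elements: List[Interval],
                       t_end: float, min_silence_s: float = 3.0) -> List[float]:
    """Song offsets followed by at least min_silence_s without vocalization."""
    usable = []
    for _, offset in songs:
        next_onset = min((on for on, _ in elements if on > offset), default=t_end)
        if next_onset - offset >= min_silence_s:
            usable.append(offset)
    return usable


# Hypothetical element times (seconds) for illustration only.
elements = [(0.5, 0.6), (5.0, 5.1), (5.2, 5.4), (5.5, 5.8),
            (6.5, 6.7), (6.8, 7.0), (12.0, 12.2), (12.3, 12.5)]
songs = find_songs(elements)
```

Because the silence criterion is checked against every vocalization and not only against songs, an isolated call shortly before a song excludes that song's initiation from the presinging analysis, as described above.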

Figure 1.
Singing behavior, neural circuitry, and
experimental subjects. A, A spectrogram (plot of
frequency vs time, with loudness indicated by the darkness of the
signal) of a zebra finch song shows the characteristic features of
song. Stereotyped sequences of syllables (lower case
letters), called "motifs" in zebra finches (indicated by
dark bars), may be sung from one to several times in
succession, preceded by a variable number of short introductory
elements (i). Amplitude oscillogram of song is
plotted below spectrogram. Audio file of song is located at
http://www.keck.ucsf.edu/~neal/jns99/fig1.wav. B, The
song system is a network of discrete nuclei involved in song learning
and production. Nuclei in the motor pathway
(gray) are critically involved in song
production, whereas nuclei in the AFP (black) are
necessary for song learning but not for production of crystallized
adult song. C, Photograph of a representative
experimental bird during a recording session. A small microdrive
(approximate dimensions: 15 mm high, 6 mm long, 3 mm wide) is inside a
protective polyurethane cap on top of the bird's head; flexible
copper-strand and silver wires terminate at the bird's head in a small
op-amp circuit.
Analysis of acoustic and neural signals. All analysis was
performed on digitized acoustic and neural signals using Matlab (MathWorks, Natick, MA). For classification of vocalizations, the sound
waveform was first compressed to speed analysis routines. After
filtering (0.5-8 kHz bandpass), the sound waveform was rectified and
smoothed with a 1 msec (SD) Gaussian function and resampled at 1 kHz
(see, for example, Fig. 5C). Individual song components were
readily identifiable by visual inspection of this signal. To locate
productions of a specific pattern of syllables in a series of data
files, a template was made consisting of the rectified song waveform
representing the song elements (typically a template consisted of
approximately three syllables of ~300 msec duration, including some
chosen because of their complex temporal structure). For each data
file, this template was iteratively subtracted from the rectified sound
waveform, at steps of 5 msec. Close matches of the template to the test
sound resulted in local minima in the resulting error function. The
accuracy of motif detection by alignment to such minima was confirmed
by visual inspection. This alignment was used for subsequent analysis
of corresponding neural activity. Digitized neural activity waveforms
were rectified and smoothed with a 10 msec (SD) Gaussian function and
resampled at 1 kHz (see, for example, Fig. 2C). This signal
was used in most analyses of LMAN activity level.
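As a rough illustration of the rectify, smooth, and resample step, the following sketch computes an activity envelope in plain Python. It is not the original Matlab code: the function names, the truncation of the Gaussian at 4 SD, and the renormalization at the edges of the trace are assumptions, and the bandpass filtering of the sound signal is omitted.

```python
import math
from typing import List, Sequence


def gaussian_kernel(sd_samples: float, n_sd: float = 4.0) -> List[float]:
    """Unit-area Gaussian weights, truncated at n_sd standard deviations."""
    half = int(math.ceil(n_sd * sd_samples))
    weights = [math.exp(-0.5 * (k / sd_samples) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def activity_envelope(signal: Sequence[float], fs_hz: float, sd_ms: float,
                      out_fs_hz: float = 1000.0) -> List[float]:
    """Rectify, smooth with a Gaussian of SD sd_ms, and resample at out_fs_hz."""
    rectified = [abs(x) for x in signal]
    kernel = gaussian_kernel(sd_ms * 1e-3 * fs_hz)
    half = len(kernel) // 2
    n = len(rectified)
    envelope = []
    # Evaluate the smoothed signal only at the output sample times.
    for i in range(int(n * out_fs_hz / fs_hz)):
        center = int(round(i * fs_hz / out_fs_hz))
        acc = 0.0
        weight_sum = 0.0
        for k, w in enumerate(kernel):
            j = center + k - half
            if 0 <= j < n:  # truncate the kernel at the ends of the trace
                acc += w * rectified[j]
                weight_sum += w
        envelope.append(acc / weight_sum)
    return envelope


# Synthetic 0.1 sec trace at 32 kHz whose rectified amplitude is constant.
trace = [1.0, -1.0] * 1600
sound_envelope = activity_envelope(trace, fs_hz=32000.0, sd_ms=1.0)
```

With the parameters given above, `sd_ms` would be 1 for the sound waveform and 10 for the neural waveform. A pure-Python loop is slow for long recordings; it is used here only to keep the sketch self-contained.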
To estimate the onset time of the mean activity peak located near song
initiation, a line was first fitted (by the least-squares method) to
the linearly rising phase of the activity peak, beginning after the
smaller slope of the slowly rising phase of activity level was replaced
by the larger slope during the sharp rise in activity level, and ending
before the time at which the slope of the activity level began
decreasing at its peak. The onset time was then estimated by
determining the intersection of this fitted line with the average
background activity level.
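The onset estimate can be illustrated with an ordinary least-squares fit. In this sketch the samples of the linearly rising phase are assumed to have been selected already, by the slope criteria described above; the function names and the example values are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(ts: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of ys against ts."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def onset_time(ts: Sequence[float], ys: Sequence[float], background: float) -> float:
    """Time at which the line fitted to the rising phase crosses background."""
    slope, intercept = fit_line(ts, ys)
    if slope <= 0:
        raise ValueError("rising phase must have a positive slope")
    return (background - intercept) / slope


# Hypothetical rising-phase samples: time (msec re song onset), activity level.
times_ms = [-60.0, -50.0, -40.0, -30.0]
levels = [1.2, 1.4, 1.6, 1.8]
estimated_onset = onset_time(times_ms, levels, background=1.0)
```

In this made-up example the fitted line has a slope of 0.02 per msec and reaches the background level of 1.0 at 70 msec before song onset.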
To quantify the temporal relationship between LMAN neural activity
level and song components (see Fig. 10), for each motif rendition the
rectified sound waveform was converted into a step function with
syllable presence indicated by a 1 and absence by a 0. We then
calculated the cross-covariance of the neural activity trace with
the discretized song (i.e., cross-correlation after subtraction of the
mean from each sequence; the cross-covariance was normalized such that
the auto-covariance of a signal with itself at zero lag equals 1.0). The
cross-covariance was calculated with the neural activity signal shifted
from -100 to +100 msec relative to the song signal, at steps of 1 msec.
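A normalized cross-covariance of this kind can be sketched as follows. Both traces are assumed to be sampled at 1 kHz, so that one sample corresponds to 1 msec; the function name and the sign convention for the lag (a positive lag means that the neural trace follows the song) are assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def cross_covariance(neural: Sequence[float], song: Sequence[float],
                     max_lag: int = 100) -> Dict[int, float]:
    """Normalized cross-covariance at lags of -max_lag..+max_lag samples."""
    if len(neural) != len(song):
        raise ValueError("traces must have the same length")
    n = len(neural)
    a_mean = sum(neural) / n
    b_mean = sum(song) / n
    a = [x - a_mean for x in neural]
    b = [x - b_mean for x in song]
    # With this normalization, a signal paired with itself gives 1.0 at lag 0.
    norm = math.sqrt(sum(x * x for x in a) * sum(x * x for x in b))
    if norm == 0.0:
        raise ValueError("both traces must vary")
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        lo = max(0, -lag)
        hi = min(n, n - lag)
        # Pair the neural sample at i + lag with the song sample at i.
        result[lag] = sum(a[i + lag] * b[i] for i in range(lo, hi)) / norm
    return result


# Step-function "song" and a hypothetical neural trace lagging it by 10 msec.
song_steps = [0.0] * 100 + [1.0] * 80 + [0.0] * 120
neural_trace = [0.0] * 10 + song_steps[:-10]
cc = cross_covariance(neural_trace, song_steps, max_lag=50)
peak_lag = max(cc, key=cc.get)
```

The lag at which the function peaks then gives the temporal offset at which the activity pattern best matches the syllable pattern.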
For several recordings in LMAN and all recordings in Area X, the firing
of single units or clusters of several neurons was analyzed by off-line
sorting of units from neural activity traces. Sorting was performed by
counting waveform events that passed below a negative threshold and
then above a positive threshold within 0.5 msec. Accurate isolation of
single units was verified by visual inspection and by examination of
interspike intervals.
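A two-threshold discriminator of the kind described can be sketched as below. The threshold values, the function names, and the choice to time-stamp each event at its negative-going crossing are assumptions; the original sorting routine is not given in the text.

```python
from typing import List, Sequence


def detect_spikes(trace: Sequence[float], fs_hz: float, neg_threshold: float,
                  pos_threshold: float, max_interval_ms: float = 0.5) -> List[float]:
    """Times (sec) of events that cross below neg_threshold and then rise
    above pos_threshold within max_interval_ms."""
    window = int(round(max_interval_ms * 1e-3 * fs_hz))
    spike_times = []
    n = len(trace)
    i = 1
    while i < n:
        crossed_down = trace[i] < neg_threshold and trace[i - 1] >= neg_threshold
        if crossed_down:
            for j in range(i + 1, min(n, i + window + 1)):
                if trace[j] > pos_threshold:
                    spike_times.append(i / fs_hz)
                    i = j  # resume the search after the positive peak
                    break
        i += 1
    return spike_times


def interspike_intervals(spike_times: Sequence[float]) -> List[float]:
    """Successive differences between spike times, for refractory-period checks."""
    return [b - a for a, b in zip(spike_times, spike_times[1:])]


# Synthetic 32 kHz trace: one valid spike, one negative-only deflection,
# and one event whose positive phase arrives too late.
trace = [0.0] * 200
trace[50], trace[51], trace[55] = -1.0, -0.5, 1.0
trace[100] = -1.0
trace[150], trace[180] = -1.0, 1.0
spikes = detect_spikes(trace, fs_hz=32000.0, neg_threshold=-0.6, pos_threshold=0.6)
```

At 32 kHz the 0.5 msec window corresponds to 16 samples, so only the first event in the synthetic trace is accepted.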
Results for all analyses are presented as mean ± SD.
RESULTS
Singing-related neural activity in LMAN
Overall relationship of LMAN firing to song production
Neural activity was recorded in LMAN of 11 adult zebra finches
during both song production and nonsinging periods. In all birds, a
conspicuous change in multiunit firing occurred during singing. In a
representative recording, activity increased markedly when a bird
produced each of three successive songs (Fig.
2A,B). Beneath the raw neural activity signal is plotted the smoothed rectified waveform (Fig. 2C), which reflects overall
multiunit activity level (see Materials and Methods for details).
Activity level is clearly different between singing and quiet periods. Additional characteristics of singing-related activity can be seen in
an expanded view of the final 4 sec from Figure
2A-C (Fig. 2D,E). The complex spectral and
temporal structure of this bird's song, which includes two iterations
of a multinote syllable (*), is evident in the spectrogram (Fig.
2D). The neural activity trace (Fig.
2E) shows condensed bursts of activity, as well as
inhibition of background neural activity; both of these features were
typically associated with singing.

Figure 2.
Neural activity in LMAN associated with song
production. A, Amplitude oscillogram of sound recorded
in a 14.5 sec period during which the bird mx-17 produced three songs.
B, Simultaneously recorded neural activity in LMAN.
C, Rectified waveform of the neural activity in
B smoothed with a 10 msec Gaussian function.
D, Spectrogram of the final 4 sec of singing from
A (epoch indicated by scale bar in C).
E, The neural activity waveform associated with the song
in D highlights the characteristic burstiness of
singing-related firing; the asterisks denote two
renditions of the same complex syllable. Audio file of songs in
A is located at
http://www.keck.ucsf.edu/~neal/jns99/fig2.wav.
Singing-related LMAN activity for two additional representative birds
is shown in Figure 3. As in the recording
shown in Figure 2, the level of neural activity during singing both
rises to higher peaks and falls to equal or lower minima than it does
during nonsinging periods (mean background activity level is indicated
by the dotted lines). To summarize these data
quantitatively, we compared the distribution of neural activity levels
during singing to that during background nonsinging periods (the
smoothed rectified waveform was sampled at 1 msec intervals). Epochs of
recording sessions were used in which the amplitude of neural activity
during singing periods was relatively stationary; that is, there was no
overall drift in spike amplitudes. For example, note the lack of change in response amplitude between the two songs displayed for each bird in
Figure 3 that were selected from the first and last 10% of the
recording session (duration of recording used for dc-12 was 3 hr
and for dc-18 was 1 hr). For each of these two birds, the 10th, 50th,
and 90th percentiles of the activity distributions during the entire
recording session for nonsinging (bg: background, sampled
from quiet periods) and singing (sing) periods are displayed immediately to the right on the same ordinate. The median and range of
neural activity levels are clearly different between nonsinging and
singing periods. To the right of these distribution plots, activity
level distributions from additional recording sessions at different
sites within LMAN of the same birds are plotted. Note the similarity in
range and medians of the activity distributions for the three recording
sessions for dc-12 (top) and the two sessions for dc-18 (bottom).

Figure 3.
Singing-related activity is seen consistently in
all recordings from LMAN. A, Rectified and smoothed (10 msec Gaussian window) waveforms of neural activity recorded during one
session for bird dc-12 and one session for bird dc-18. For both
sessions, in the left panel the pattern of activity
before, during, and after a song bout from the first 10% of the
recording session (early) is plotted; in the
right panel activity is plotted in the same way for
songs from the final 10% of the session (late).
Bars to the immediate right depict the
distribution of activity levels (sampled at 1 msec intervals) during
nonsinging background (bg) and singing
(sing) periods across the entire recording session.
Top and bottom ticks denote the 90th and
10th percentiles of these distributions, whereas middle thick
ticks denote medians. Values for both distributions have been
normalized by the mean background level. The total duration of singing
used was 108 sec for dc-12 and 270 sec for dc-18. Activity level
distributions for two (dc-12) and one
(dc-18) additional recording sessions at different sites
in LMAN are plotted on the same ordinate to the far
right. B, Mean activity level during singing
divided by mean activity level during background period for all
recording sessions from each bird (each symbol
represents a different bird; some symbols have been
offset vertically to reduce overlap). C, coefficient of
variation (c.v.) of activity level during
singing divided by c.v. of activity level during background period for
each recording from every bird (each symbol represents a
different bird). For comparisons of activity in B and
C, the mean duration of singing was 158 ± 168 sec
(range 16-802 sec) and of background was 136 ± 157 sec (range
30-795 sec).
To quantify the effect of singing on overall activity level for all
birds, the average activity level in successive 1 sec epochs of
background and singing periods was calculated (the duration of the
shortest songs was ~1 sec). In 27 of 27 recordings from 11 birds, the
level of activity in LMAN was higher during singing than during
nonsinging periods (p < 0.01 for all unpaired
t tests for each recording session), with activity during
singing averaging 132 ± 8% (SD) of the background level (Fig. 3B). In all recording
sessions, modulation of activity also increased during singing (Figs.
2, 3A). This effect was quantified by calculating the
coefficient of variation (c.v.), the SD of the values of an
activity distribution normalized by the mean activity, during singing
and nonsinging periods. Paired comparisons within recording sessions
revealed a significantly larger c.v., and thus increased modulation, of
neural activity levels during singing compared with that during
background periods (p < 0.01, sign test) (Fig.
3C). Across all recordings, the mean c.v. for background
nonsinging periods was 0.11 ± 0.02, and for singing periods the
c.v. was 0.22 ± 0.05.
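The two summary measures used here, the ratio of mean activity levels and the ratio of coefficients of variation, can be written out as a short sketch, together with a two-sided sign test. The function names and example values are hypothetical, and the use of the sample SD (rather than the population SD) is an assumption.

```python
import math
import statistics
from typing import Dict, Sequence


def coefficient_of_variation(levels: Sequence[float]) -> float:
    """SD of the activity levels divided by their mean."""
    return statistics.stdev(levels) / statistics.mean(levels)


def singing_vs_background(singing: Sequence[float],
                          background: Sequence[float]) -> Dict[str, float]:
    """Ratios of mean level and of c.v. (singing divided by background)."""
    return {
        "level_ratio": statistics.mean(singing) / statistics.mean(background),
        "cv_ratio": coefficient_of_variation(singing)
        / coefficient_of_variation(background),
    }


def sign_test_p(n_increase: int, n_decrease: int) -> float:
    """Two-sided sign test p value for paired comparisons (ties excluded)."""
    n = n_increase + n_decrease
    k = min(n_increase, n_decrease)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical activity levels in successive epochs, for illustration only.
background_levels = [0.9, 1.0, 1.1, 1.0]
singing_levels = [1.0, 1.6, 0.8, 1.8]
summary = singing_vs_background(singing_levels, background_levels)
```

A c.v. ratio above 1 corresponds to the increased modulation during singing described above. The sign test depends only on how many sessions changed in each direction, not on the size of the change.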
For the recording sessions shown in Figures 2 and 3A, we
also quantified activity level by calculating the multiunit spike rate,
using off-line window discrimination (for details, see Materials and
Methods). When spike arrival times were convolved with a 10 msec
Gaussian function, as was done for the rectified neural waveforms, the
distributions of activity levels for background and singing periods
were similar to those quantified by rectifying and smoothing activity
waveforms. The magnitude of increase in activity level from background
to singing periods for these three recording sessions was similar when
quantified by these two methods: 1.17 versus 1.25, 1.31 versus 1.26, and 1.38 versus 1.28 for rectified waveform and sorted multiunit spikes,
respectively. Thus, as a measure of overall population activity level,
the rectified and smoothed activity waveform is approximately
equivalent to sorted multiunit spikes.
Activity level in LMAN rises before and is depressed
after singing
In addition to its increase during singing, the level of activity
was consistently altered before and after singing in all recordings
from LMAN. This effect, which was evident in Figures 2 and
3A, is shown in Figure
4A-C in a
different manner for three representative birds, by plotting LMAN
activity level during multiple presong and postsong epochs. To limit
the possible influence of previous or subsequent singing, we examined
only song initiations before which the birds had been silent for >3
sec, and only song terminations after which birds refrained from
vocalizing again for >3 sec (for details, see Materials and Methods).
In the top section of each panel, the multiunit activity level during
successive initiations and terminations, as quantified by rectified
neural waveforms, is plotted on a gray scale, with darker shades
representing higher activity levels. These plots illustrate the
consistent rises in activity level before song initiations and the
clear diminution of activity after song terminations.

Figure 4.
LMAN activity is consistently altered before song
initiations and after song terminations. A-C,
Left and right panels depict LMAN
multiunit activity level during presinging and postsinging intervals
for three representative birds. For each, the top gray scale
panels plot activity level over time (left to
right) in relation to successive song renditions
(horizontal rows). Activity level has been quantified by
rectifying and smoothing raw activity waveforms (see Materials and
Methods). Increases in activity level are represented by darker shade,
and decreases are represented by lighter shade. Times of song
initiation and termination (by which activity traces for
left and right panels, respectively, were
aligned) are indicated by arrows above and below gray
scale panels. Mean activity level (normalized to background nonsinging
level) across all renditions is plotted below the gray scale plots,
along with dotted lines indicating 70th (left
panels) and 30th (right panels) percentiles of
background activity level distributions; background activity was
recorded during periods interspersed among presong and postsong
recordings. Asterisks underneath mean activity level
traces in left panels indicate calculated onset time of
sharp activity peaks. Lower and upper limits of gray scale bars are
0.75 and 2.0, 0.25 and 2.75, and 0.75 and 1.9 for A,
B, and C, respectively. D,
Summary of timing and amplitudes for average presinging and postsinging
parameters: a, time at which mean activity level rose
above the 70th percentile of background activity level distribution for
the 10 of 15 recording sessions in which this occurred before the
peri-initiation peak. b, Peri-initiation peak timing and
magnitude relative to background, connected by a
line to the onset latency at which this peak rose from
background
These features of the relationship of activity to song initiation and
termination are evident in plots of the mean activity level across all
renditions (Fig. 4A-C, bottom
panels). In all recordings, a prominent peak of activity was
located very near song onset (peri-initiation peak). Initiation
latencies of these peaks for the three examples in Figure 4 are
indicated by an asterisk beneath the presinging mean activity trace
(latencies estimated by fitting a line to the peak's rising phase; for
details, see Materials and Methods). Across all recordings, the mean
onset latency of the peri-initiation peak relative to song initiation was 70 ± 24 msec, with the average peak maximum located 9 msec before song onset (Fig. 4Db) (24 recording sessions
from 10 birds; mean 56 song initiations per session; for recordings
with fewer than 10 songs in which bird was quiet for the preceding 3 sec, songs in which the bird was quiet for at least 1 sec were used; data from one bird for which there were fewer than 10 such songs were
not included in this analysis).
Along with the sharp rise of activity near song onset, in many
recordings a gradual rise in activity level also preceded song initiation. To compare this feature across recordings, we calculated the time at which the mean activity level rose and remained above the
70th percentile of the background activity level distribution (represented by the dotted line on the left side
of the bottom panels in Fig.
4A-C). For these measurements, we
selected song initiations that were preceded by at least 2.5-3 sec of
silence: 15 recording sessions contained more than 10 such songs. The
mean activity level in 10 of 15 of these recordings rose above the 70th
percentile of background activity level before the immediate presong
peak, at an average latency of 419 ± 137 msec (Fig.
4Da) (for recording sessions in Fig.
4A,B, latencies were 331 and 352 msec, respectively). The mean activity level in the remaining five
recording sessions did not rise above background levels until the immediate peri-initiation peak, as in the example shown
in Figure 4C.
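The latency of the gradual presong rise can be sketched as a sustained threshold crossing. This illustration assumes that the mean activity trace ends at song onset and that the percentile is obtained by linear interpolation between ranked samples; the function names and the numbers are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) by linear interpolation between ranked samples."""
    ranked = sorted(values)
    pos = (len(ranked) - 1) * q / 100
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ranked) - 1)
    return ranked[lo] + (ranked[hi] - ranked[lo]) * (pos - lo)


def sustained_rise_time(mean_activity: Sequence[float], times_ms: Sequence[float],
                        threshold: float) -> Optional[float]:
    """Earliest time from which activity stays above threshold until the end
    of the trace (taken to be song onset); None if it never does."""
    onset_index = None
    for i in range(len(mean_activity) - 1, -1, -1):
        if mean_activity[i] > threshold:
            onset_index = i
        else:
            break
    return None if onset_index is None else times_ms[onset_index]


# Hypothetical background samples and presong mean activity trace.
background: List[float] = [0.8, 0.9, 1.0, 1.1, 1.2]
threshold_70 = percentile(background, 70)
times_ms = [-500.0, -400.0, -300.0, -200.0, -100.0, 0.0]
activity = [1.0, 1.2, 1.05, 1.1, 1.3, 1.8]
rise_time = sustained_rise_time(activity, times_ms, threshold_70)
```

In the example, the brief excursion at -400 msec is ignored because activity falls back below the threshold afterward, which matches the "rose and remained above" criterion.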
In contrast to the presinging activity increases, firing was
consistently depressed after song termination and required from 1 sec
to several seconds to return to nonsinging levels (Fig. 4A,C, right panels).
Across all recordings, activity level reached a minimum amplitude of
0.77 ± 0.11 relative to background at 198 ± 99 msec after
song termination (Fig. 4Dc) (16 recordings from eight
birds; data from three birds for which there were fewer than 10 quiet
postsong intervals were not analyzed). Following this minimum, an
average of 1172 ± 407 msec was required for activity levels to
recover to over the 30th percentile of background activity level
(represented by the dotted line in right panels
of Fig. 4A-C; for recording sessions in
A-C, latencies were 1485, 1479, and 314 msec, respectively).
These presinging and postsinging changes in activity level were
extremely consistent; for all birds they occurred before and after
almost every song. For example, in the recording session shown in
Figure 4A, the mean activity level in a period from
50 msec preceding song to song onset was, for every rendition, higher than in a period from 2 to 1.5 sec before singing (Fig.
4E, filled circles). Conversely, the activity level in the
immediate postsong period (averaged in a 50 msec window around the
location of the minimum of the mean) was consistently lower than that
from 1.5 to 2 sec after song termination (Fig. 4E,
open circles). For this and other recordings in which at least 10 presinging and
postsinging renditions were obtained, activity level was significantly
higher just before song onset than 2 sec before singing and was
significantly lower just after song termination than 2 sec later
(comparisons were made as in Fig. 4E;
p < 0.01 for all paired t tests comparing presong and postsong periods in each of 16 recording sessions from
eight birds).
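
A paired comparison of this kind can be sketched as below. This is a minimal illustration, not the authors' analysis code: the function name is hypothetical, and only the t statistic and degrees of freedom are computed (the p value would be read from a t distribution, which is not implemented here).

```python
import math


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched samples a, b.

    For example, a[i] could be the activity level just before the i-th song
    onset and b[i] the level 2 sec earlier for the same rendition.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the per-rendition differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("differences have zero variance")
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1
```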
LMAN firing during repeated renditions of a stereotyped song motif
In addition to characterizing the general association of firing to
song production, we examined the relationship between repeated song
elements and patterns of neural activity. In zebra finch song,
sequences of several syllables are produced in a stereotyped order to
form a song motif, which can then be repeated a variable number of
times (Fig. 1A). Here, by using a template-matching algorithm, we detected multiple renditions of stereotyped song motifs
during single recording sessions and analyzed the pattern of neural
activity associated with each motif (see Materials and Methods for
details of detection protocol).

Figure 5.
Neural activity in LMAN associated with repeated
song motif renditions. A-C, Spectrogram
(A), oscillogram (B), and
amplitude envelope (C) of a song repeatedly
produced by bird dc-18. D, Amplitude envelopes of 159 consecutive song renditions recorded over 3 hr, aligned for maximal
overlap between the final four syllables. Darkness is proportional to
sound intensity. E, Multiunit neural activity level
recorded during each song rendition displayed in D.
Activity level has been quantified by rectifying and smoothing raw
activity waveforms. Lower (left) and upper
(right) limits of gray scale bar are 0.75 and 2.25. F, Mean neural activity level (normalized by background
nonsinging activity level: dotted line) across these 159 song renditions. Mean timing of syllable occurrence is overlaid for
comparison of activity to song production. G, Raw
waveforms of neural activity during four consecutive song renditions.
The rectified integrated version of each waveform is plotted above the
raw waveform. Audio files of five representative songs are located at
http://www.keck.ucsf.edu/~neal/jns99/fig5/fig5.html.

activity level, for 24 recording sessions from 10 birds.
c, Timing and magnitude of postsong minimum ( ),
connected by a line to the time at which mean activity level rose above
the 30th percentile of background activity level, for 16 recording
sessions from eight birds. E, Reliability of presinging
and postsinging activity changes for one recording. Filled
circles represent, for each song initiation, mean activity
level from 50 to 0 msec before initiation versus mean activity level
from 2 to 1.5 sec before initiation. Open circles
represent mean activity level in a 50 msec window around the average
minimum versus mean activity level in a period from 1.5 to 2 sec after
song termination. The dotted line indicates values at
which activity level in these two periods would be equal. These data
are from the recording session shown in A.
A representative example of data used in such an analysis is shown in
Figure 5. A common song type of bird
dc-18 contained a variable number of introductory elements (range 1-4
per song, mean 1.7), followed by two stereotyped five-syllable motifs
that were separated by an intervening syllable (Fig.
5A-C, s). Note the similarity in
acoustic structure between the repeated motifs. This song type was
produced 159 times during a 4 hr recording session. The amplitude
envelopes of all successive song renditions, aligned so that the
amplitude envelopes of the final four syllables of the song were most
similar, are shown in Fig. 5D (see Materials and Methods for
details of alignment procedure). The temporal structure of all song
renditions was extremely stereotyped. The alignment of the first motif
appears more variable, primarily because of variability in the interval
between syllable e of the first motif and syllable
s (SD of this interval duration = 6.0 msec; the mean SD
of nine other interval durations = 2.9 msec, range 1.6-4.3 msec).
Across all renditions, the first motif of the songs in this recording
session had a slightly shorter duration than the second motif
(p < 0.01, paired t test,
n = 159; first motif duration = 721 ± 8.2 msec, second motif 736 ± 7.88 msec).
During all of these song renditions, the pattern of activity in LMAN
was relatively consistent in relation to song elements (Fig.
5E). High and low activity levels occurred during similar song components across the recording session, as reflected by the
presence of dark and light vertical bands in Figure 5E. The pattern of neural activity averaged across all song renditions appeared
very similar during production of the first and second motifs (Fig.
5F). Because of the greater variability in alignment of syllables across renditions for the first motif, the mean activity pattern during the first motif was less sharp than that during the
second motif (Fig. 5F). When songs were aligned by
the last four syllables of the first motif, however, the mean activity pattern for the first motif was nearly identical to that for the second
motif in Figure 5F; the correlation coefficient between these activity patterns was then 0.96. In contrast to the
reproducibility of the mean activity pattern when averaged across many
renditions, there was a lower degree of stereotypy when activity was
examined on a rendition-to-rendition basis (as could be seen previously in Fig. 2D,E). This variability, as
well as the characteristic "bursty" quality of LMAN singing-related
activity, is highlighted by the four successive activity waveforms, and
their associated rectified waveforms, displayed in Figure
5G.
Variability in LMAN activity across multiple motif renditions
When activity level in LMAN during many renditions of a motif was
examined, the overall pattern of activity could be seen to resemble the
mean activity level (Fig. 5, compare E,
F). When small numbers of motifs were inspected,
however, it was more difficult to discern any repeated pattern of
activity, because of the variability of activity between individual
motifs. This is clear in a representative recording, in which activity
level in LMAN during 10 successive renditions of the stereotyped song
motif is plotted (Fig.
6A, thin
lines). We quantified the degree of stereotypy of neural activity
across renditions at single recording sites in two ways. As one
measure, we calculated the cross-rendition c.v. of activity level
across all renditions of an identical song motif (for each millisecond
time point in the motif, SD of activity level across all renditions was
divided by the mean activity level) (Fig. 6C, dotted
line). The mean value of this measure across the entire song motif
was similar in magnitude in all recordings from LMAN (Fig.
6D) (only recording sessions with more than 10 motif
renditions were included), with an average of 0.18 ± 0.05. As a
second measure of variability across renditions, we calculated the
correlation coefficients between the activity waveform for each
individual rendition (Fig. 6A, thin lines)
and the mean activity level across all renditions (Fig.
6A, thick line). For this recording in
LMAN, the mean of these correlation coefficients across 360 motif
renditions was 0.43 ± 0.17 (by comparison, for 159 renditions of
the two-motif song in Fig. 5 the mean was 0.47 ± 0.12). Similar
results were obtained for all other birds; across all recording
sessions in LMAN the average correlation coefficient was 0.47 ± 0.09 (Fig. 6E).
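
The two stereotypy measures described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, each rendition is assumed to be an equal-length, time-aligned list of activity levels (one value per millisecond), and the use of the sample SD (n - 1 denominator) is an assumption, because the text does not say which form was used.

```python
import math


def mean_trace(renditions):
    """Point-by-point mean activity level across aligned renditions."""
    n = len(renditions)
    return [sum(col) / n for col in zip(*renditions)]


def cross_rendition_cv(renditions):
    """Return (c.v. at each time point, mean c.v. over the motif).

    The c.v. at a time point is the SD across renditions divided by the
    mean across renditions at that time point.
    """
    n = len(renditions)
    cv = []
    for col in zip(*renditions):
        m = sum(col) / n
        sd = math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        cv.append(sd / m)
    return cv, sum(cv) / len(cv)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length traces."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_correlation_with_mean(renditions):
    """Average correlation of each rendition with the across-rendition mean."""
    avg = mean_trace(renditions)
    rs = [pearson(r, avg) for r in renditions]
    return sum(rs) / len(rs)
```

A low mean c.v. together with a high mean correlation indicates stereotyped activity; the two measures are complementary, because the c.v. is sensitive to amplitude scatter at each time point whereas the correlation captures similarity in the shape of the activity pattern.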

Figure 6.
Quantification of response variability across
multiple motif renditions. A, B,
Multiunit activity level during 10 successive motif renditions
(thin lines) for representative LMAN (A,
bird dc-17) and RA (B, bird xr-1) recording sessions,
along with the mean activity level across all renditions over a
stationary recording epoch (thick lines). Average timing
of syllables in the motif is indicated by thick black
lines placed at y = 1. Neural activity waveforms recorded during three successive
renditions of the song motifs are plotted below for both nuclei.
C, Timing of c.v. (SD/mean) across all renditions for
both recordings (n = 61 for LMAN,
n = 95 for RA). D, Average c.v. of
activity level across all renditions of the stereotyped song motif
(recordings from each bird are represented by different symbols).
Empty symbols and crossed lines are used
for 24 recording sessions from 10 birds in LMAN; filled
squares and circles are used for 9 recordings
from 4 birds in HVc and 7 recordings from 5 birds in RA, respectively.
Some symbols have been slightly offset horizontally to reduce overlap
between them. Mean c.v. values across all recordings from nuclei are
denoted by lines to the right of symbols.
E, Mean correlation coefficient of activity level
waveform for each individual rendition with the mean activity level
across all renditions; symbols are as in
D.
In contrast to the results for LMAN recordings, when we recorded and
analyzed multiunit activity in the same way for the song system motor
nuclei HVc and RA, there was much higher stereotypy across motif
renditions [see also Yu and Margoliash (1996) ; Vu et al. (1998) ]. As
can be seen for representative LMAN and RA recording sites, activity
patterns during successive motif renditions were much less variable for
multiunit responses in RA (Fig.
6A,B). Furthermore, when quantified
in the two ways presented above, activity in nine of nine recording
sessions from four birds in HVc (mean 305 motifs/session) and seven of
seven recording sessions from five birds in RA (mean 65 motifs/session)
was more stereotyped across renditions than that in all recordings from
LMAN (Fig. 6D,E) (mean
cross-rendition c.v. values for HVc and RA recordings were 0.08 ± 0.02 and 0.07 ± 0.01, respectively, whereas mean correlation coefficients between activity during each rendition and mean activity levels were 0.83 ± 0.07 and 0.91 ± 0.03, respectively).
Given the multiunit character of the recordings from these three
nuclei, it is possible that some of the differences in stereotypy reflected a tendency for sites in HVc and RA recordings to include greater numbers of neurons. However, these differences existed even
between LMAN and RA recording sessions that appeared to be sampling a
similar number of units. Furthermore, we have recently shown that
differences in stereotypy such as those between RA and LMAN can be
detected while recording from a single group of neurons in a single
recording session. When birds sang "directed song" aimed at a
recipient, the mean cross-rendition c.v. during 13 recording sessions
from six birds in LMAN was 0.10 ± 0.03, significantly different
from that during undirected singing at the same sites (mean = 0.16 ± 0.03) (Hessler and Doupe, 1999 ). The low variability
during directed singing is also different from that seen here during
undirected singing in LMAN (Fig. 6D) (mean 0.18 ± 0.06) and is more similar to the level of cross-rendition variability seen here during undirected singing in HVc and RA (Fig.
6D).
The variability in activity between single renditions in LMAN was even
more striking when activity of single rather than multiple units was
examined. Figure 7 shows an example of
one recording session in which a single unit was well isolated. During
this period, the bird produced nine renditions of a fairly stereotyped song (Fig. 7A) (song elements indicated by the overlying
line were repeated in all renditions; there were slight variations in
the timing and presence of additional introductory elements, and in one
rendition several additional syllables followed the components beneath
the line). This unit fired much more strongly during singing than
nonsinging periods; the mean firing rate increased from 9.5 to 16.2 spikes/sec. Spikes tended to occur in clusters (the median interspike
interval was 8.19 msec during singing). During multiple renditions of
the same song elements, there was little reproducibility in the pattern
of firing (Fig. 7B,C). To address
this more closely, we examined the pattern of firing during repeated
renditions of a complex syllable that was repeated 45 times during the
session. There were some general tendencies for spikes to occur in
particular points of the syllable, evident in the mean spike rate (Fig.
7F) as well as in the raster plots (Fig.
7E). However, the level of reliability of firing was quite low; during some renditions many spikes occurred at times at which none
occurred during other renditions (Fig. 7F,
asterisk). These preliminary data suggest that at the level
of single units, firing patterns may be even less consistently linked
to production of specific song elements than was indicated by multiunit
recordings.

Figure 7.
Firing of a single unit in LMAN during multiple
song renditions. A, Spectrogram of a typical song
produced by bird mx-17. Song components beneath bar were
produced consistently in nine song renditions. B, Spike
arrival times of action potentials of a single unit during production
of these renditions. Songs were aligned so that the amplitude profile
of the last syllable in the song matched best across renditions.
Stereotypy of timing across the entire song underneath the
bar remained precise. The SD of song termination time
was 2.4 msec, whereas the SD of song initiation time (on average 1290 msec earlier) was increased only to 11.1 msec. C, Neural
activity waveforms from which the top three spike rasters in
B were obtained. D, Spectrogram of a
complex syllable produced 45 times during recording session (* in
A). Timing of production of this syllable was very
stereotyped. Across all renditions, syllable onset time had an SD of
2.9 msec, as did syllable termination time. E, Spike
raster of single-unit firing during multiple renditions of this
stereotyped syllable. F, Mean firing rate across all
renditions of this syllable; spike arrival times were smoothed with a
10 msec Gaussian window.
Pattern of LMAN activity at different recording sites within a bird
In contrast to such variability in activity level between
individual renditions at a single site, the pattern of average
multiunit activity related to the song motif was very similar at
different recording sites in LMAN of individual birds. In most birds,
multiple recording sessions were made through the depth of LMAN
(successive recording sites were separated by ~80 µm, except for
one pair in bird mc-j, which was separated by ~40 µm) over a period
of several days to several weeks. In general, peaks and troughs of average activity profiles appeared in similar locations in all recordings from a single bird (data from three representative birds are
displayed in Fig. 8). To quantify the
degree of similarity between activity patterns across multiple
recording sessions, we calculated correlation coefficients between the
mean activity level during different recording sessions from individual
birds. For six of seven birds in which recordings were made at more
than one site, correlations between response profiles at different sites were high, with a mean across birds of 0.76 ± 0.06 (range 0.66 to 0.82) (Fig. 8, mc-j, dc-24). For
the seventh bird, the pattern was more inconsistent between sessions
(Fig. 8, dc-12). This inconsistency did not appear to result
from the greater number of recording sessions in this bird and thus a
slightly increased distance between different recording sites compared
with the other birds. The average correlation coefficient for the three
comparisons between recording sites separated by ~80 µm was
0.12, for two comparisons at 160 µm was 0.06, and for one comparison
at 240 µm was 0.53.

Figure 8.
Pattern of multiunit activity at multiple
recording sites in individual birds. Mean activity level recorded
during production of stereotyped song motifs during three
(mc-j), two (dc-24), and four
(dc-12) recording sessions at different sites in LMAN.
Activity level for each session has been normalized by background
activity level during nonsinging periods. Average timing of syllable
occurrence for each bird is indicated by bars at
y = 1.0. The average number of motifs per recording
session was 279 ± 216. Correlation coefficients between mean
activity waveforms (or averages of these when there are more than 2 sites) are displayed to the right for each bird.
Temporal relationship between LMAN activity level and song elements
For all recordings from LMAN, there was a highly phasic activity
profile when neural activity was averaged over multiple motif renditions (Figs. 5F, 8). It was difficult, however, when
comparing the range of responses across all birds, to ascertain a
simple relationship between the mean response profile at a recording site and the complex acoustic structure of the associated song output.
This difficulty is attributable in part to differences in the timing
and structure of songs of different birds. However, even for three
birds that copied their songs from the same tutor and thus sang very
similar motifs (Fig. 9A),
average LMAN activity patterns clearly differed from each other (Fig.
9B).

Figure 9.
Multiunit activity during similar motifs produced
by three different birds. A, Oscillograms (top
panels) and spectrograms (middle and bottom
panels) of stereotyped song motifs of birds dc-8, dc-12,
and dc-17, with gray scale plots of multiunit activity level versus
time (as in Figs. 4-5) during 100 successive renditions of these song
motifs. bg, Background nonsinging activity level during
period from which activity during motifs was recorded, plotted on the
same gray scale axis as activity during motifs. B, Mean
pattern of activity across 100 renditions of the song motif for
each bird, normalized by background activity level. Individual birds
are represented by line styles shown at
left in A. Mean timing of syllable
occurrence for bird dc-17 is overlaid for reference. Audio files of
representative song motifs for these birds are located at
http://www.keck.ucsf.edu/~neal/jns99/fig9.html.
Therefore, we used a simple measure to compare the relationship of
activity with song structure across birds singing different songs. We
determined when activity tended to occur relative to vocalizations,
without consideration of syllable identity. The way in which this was
done is summarized in Figure
10A-C,
using the data shown in Figure 9A for bird dc-17. For every
motif rendition, the cross-covariance between the activity level trace
and a discretized version of the associated song motif (Fig.
10A) was calculated, with the neural activity trace
shifted from -100 to +100 msec relative to the song (Fig.
10B). Averaged across all renditions (Fig.
10C), this cross-covariance function had a peak when neural activity was shifted forward in time (in the plot, to the
right) relative to song by 35 msec (Fig.
10A, gray line) and a trough when activity
was shifted backward relative to song by 21 msec. Across all motif
renditions, the values of the peak and minimum were significantly
different from zero (p < 0.01, t
test; averaged in a 15 msec window at peak and minimum). Although the
mean firing patterns of the birds in the top two panels of
Figure 9A were quite different from that for the bird dc-17
analyzed in Figure 10A-C, the mean
cross-covariance functions for them also had a peak value when activity
was shifted forward relative to songs.
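
The shift analysis described above can be sketched for a single motif rendition as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, traces are assumed to be sampled at 1 msec, the covariance at each shift is computed over the overlapping samples only, and no further normalization is applied. A positive shift moves the activity trace later in time (to the right) relative to the song.

```python
def cross_covariance(activity, song, max_shift=100):
    """Covariance between a binary song trace and a shifted activity trace.

    Returns a dict mapping shift (in samples, -max_shift..+max_shift) to
    the covariance at that shift. At shift s, the shifted activity at time
    t is activity[t - s]; only samples where both traces exist are used.
    """
    n = len(activity)
    if len(song) != n or max_shift >= n:
        raise ValueError("traces must be equal length and longer than max_shift")
    result = {}
    for s in range(-max_shift, max_shift + 1):
        pairs = [(activity[t - s], song[t])
                 for t in range(n) if 0 <= t - s < n]
        mean_a = sum(a for a, _ in pairs) / len(pairs)
        mean_s = sum(b for _, b in pairs) / len(pairs)
        result[s] = sum((a - mean_a) * (b - mean_s)
                        for a, b in pairs) / len(pairs)
    return result


# Toy example: a burst of activity that leads a syllable by 35 msec.
song = [1.0 if 50 <= t < 80 else 0.0 for t in range(200)]
activity = [1.0 if 15 <= t < 45 else 0.0 for t in range(200)]
xc = cross_covariance(activity, song)
peak_shift = max(xc, key=xc.get)  # 35: activity must move forward to overlap
```

Averaging the resulting functions across all renditions of a motif would give the mean cross-covariance; a peak at a positive shift then indicates that activity tends to precede vocalization.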

Figure 10.
Quantification of relationship of activity
patterns to motif elements. A, Binary representation of
syllables (thin black line) in a representative motif of
bird dc-17 (taken from recording session shown in Fig.
9A). The presence of a syllable is represented by a
value of 1, the absence of a syllable by a value of 0. Multiunit
activity level during one rendition of this song motif is indicated by
the thick black line. The thick gray line
denotes the same activity trace after it was shifted by +35 msec to the
right relative to the syllable trace (i.e., shifting it by the amount
that produced the maximum average cross-covariance between activity and
song). B, Magnitude of cross-covariance for 100 renditions of the motif; for each rendition, the cross-covariance
between activity and the discretized song was calculated at successive
1 msec steps, with activity waveform progressively shifted from +100 to
-100 msec relative to song. Lower and upper limits of gray scale bar
are -0.3 and 0.4. C, Mean cross-covariance across all
motif renditions. D, Summary plot shows peak and trough
locations and amplitudes for the mean cross-covariance function in each
LMAN recording from every bird (each bird is represented by a different
symbol).
Such a peak for forward activity shifts was seen in all but one
recording session from all birds, and in all cases the value of the
cross-covariance function at the peak was significantly different from
zero across all motif renditions (Fig. 10D)
(t test, all p < 0.01 for each of 24 recording sessions from 10 birds in which more than 10 motifs were
produced, average 325 motifs per session, range 23-750). Over all
recordings, the peak location averaged across all renditions in a
session was similar to that in Figure 10C, averaging 32 ± 14 msec forward relative to song. For 8 of 10 birds there was also a
significant minimum across all renditions when neural activity was
shifted backward in time relative to syllables (averaged in a 15 msec
window at the minimum; t test, all p < 0.01 for each of 20 recording sessions from eight birds). Thus, as in the
example in Figure 10A, shifting the neural activity
pattern forward relative to the song tended to increase the overlap
between peaks of activity and syllables, which resulted in the peak in
the mean function. Similarly, shifting neural activity backward
relative to song often placed peaks of activity within an intersyllable
interval and reduced the covariance between activity and syllables.
These results reflect the general tendency (Figs. 5E,F, 8, 9) for peaks of LMAN
activity to occur near the beginning of syllables, usually just before them.
As a way of comparing the activity of LMAN and the motor pathway, we
performed the same analysis on the relationship of HVc activity to song
elements during undirected singing. The mean cross-covariance functions
for these recordings were quite similar to those obtained from LMAN. On
average, peak locations for HVc recordings were at a shift of 33 ± 5 msec forward (to the right) relative to song elements (range
24-38 msec, n = 9 recording sites from four birds).
Thus, by this assay, the average temporal relationship between neural
activity and song components is similar for HVc and LMAN, although
given the variability between birds, it will ultimately be important to
compare such analyses in recordings from the same birds.
Singing-related firing of Area X neurons
The AFP nucleus Area X receives the initial input from the song
system motor pathway nucleus HVc (Fig. 1B), so if
singing-related activity in LMAN originates in HVc it should be present
in Area X as well. We therefore investigated whether the firing of
neurons in Area X was also modulated during singing. As in LMAN,
recordings were made both while birds were silent and while they were
producing undirected song.
The background firing characteristics of Area X were quite different
from those of LMAN. Although all recordings from within LMAN sampled a
grossly similar pattern of multiunit activity, with a range of sizes of
spikes, at many sites in Area X no clear neural activity could be
distinguished above the recording noise level. At intervals of hundreds
of micrometers, however, a fast regular firing pattern of spontaneous
activity could be recorded (Fig.
11A). Because these
spikes were significantly larger than others recorded by the electrode,
single units or clusters of one to two units could be readily
discriminated from action potentials of smaller cells, in contrast to
the situation in LMAN. The average background spike rate of such
regular-firing single units was 160 Hz (range 118-197 Hz,
n = 7 units), higher than that reported for these
neurons in anesthetized birds (the highest rate for adult Area X units
was 75 Hz) (Doupe, 1997 ).

Figure 11.
Firing of a single unit in Area X is associated
with song production. A, Spontaneous singing in a 10 sec
period by bird mx-13, represented in spectrogram and oscillogram
formats (top two panels), along with the concurrently
recorded neural activity of a single unit in Area X (bottom
panel). Below the raw neural waveform is plotted the
instantaneous firing rate of this neuron, calculated by convolving
spike arrival times with a 10-msec-wide Gaussian window.
B, Spectrogram of vocalization (top
panel) and concurrently recorded neural activity
(bottom panel) from two 1 sec epochs at the
beginning (left) and end (right) of song
in A, plotted on an expanded time scale. Pauses in
neural firing are denoted by asterisks above neural
activity waveform. Audio file of song in A is located at
http://www.keck.ucsf.edu/~neal/jns99/fig11.wav.
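
The instantaneous firing rate mentioned in this legend can be sketched as below. This is a minimal illustration with a hypothetical function name. The legend specifies only a "10-msec-wide" Gaussian window, so treating 10 msec as the SD of a unit-area Gaussian kernel, and sampling the rate every 1 msec, are assumptions of this sketch.

```python
import math


def instantaneous_rate(spike_times_ms, duration_ms, sigma_ms=10.0, dt_ms=1.0):
    """Firing rate (spikes/sec) sampled every dt_ms over [0, duration_ms).

    Each spike is replaced by a unit-area Gaussian with SD sigma_ms (an
    assumed reading of the window width), and the kernels are summed.
    """
    n = int(duration_ms / dt_ms)
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))  # kernel height per msec
    rate = []
    for i in range(n):
        t = i * dt_ms
        total = sum(math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
                    for s in spike_times_ms)
        rate.append(1000.0 * norm * total)  # convert per msec to per sec
    return rate
```

Because each kernel has unit area, a perfectly regular spike train yields a nearly flat rate equal to its true frequency away from the ends of the record, and pauses in firing appear as dips whose depth depends on the kernel width.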
The firing of Area X neurons, despite the differences from LMAN in
background level, also changed conspicuously during singing. A
representative example of such a response is shown in Figure 11. From a
stable fast firing rate of ~100 Hz, this neuron increased its firing
rate to >200 Hz during singing. Note also, especially in the expanded
time scale in the bottom panel, the occurrence of pauses in
firing before vocal output (marked by asterisks above the
neural activity waveform in Fig. 11B). Overall,
increases in firing rate during singing were seen in 11 of 13 single-unit and small-cluster recordings from six birds (Fig.
12A,B).
For two sites, however, the mean firing rate during singing was
decreased relative to background levels (Fig.
12A,B; one in mx-11, one
in dc-24). For these sites, as was true to a lesser
extent for many sites with overall firing increases, pauses of firing
were more frequent during singing than during quiet periods (reflected
in the decreased 10th percentile of firing rate distribution compared
with background).

Figure 12.
Singing-related modulation of neural firing was
seen in all recordings from Area X. A, Distributions of
instantaneous spike rate for seven single-unit recordings from four
birds, recorded during background (b, left distribution
for each recording) and singing (s, right distribution
for each recording) periods. In each plot, upper and
lower horizontal ticks denote the 90th and 10th
percentiles of the firing rate distribution, whereas the middle
thick tick denotes the median. B, Distributions
of instantaneous firing rate for six small cluster recordings from four
birds. Individual recordings represented as in A. The
multiunit firing rate for each recording has been normalized by
background nonsinging rate, because the number of neurons contributing
to a recording site may have differed between recording sessions. The
average duration of singing and background recording periods used were
67 and 38 sec, respectively.
As was true for LMAN neurons, the activity of Area X neurons was
modulated before and after singing. Figure
13 illustrates an example of such
modulation for a recording from a small cluster of neurons in Area X. Before song initiation, there was an increased frequency of both pauses
and bursts of firing (see also Fig. 11). After song termination, there
was a slow decline in firing rate. Such a gradual decrease to
background level was different from the postsinging inhibition
consistently seen in LMAN recordings (Fig. 4Dc) and
was observed for all Area X recordings in which we obtained a
sufficient number of silent postsinging epochs for analysis (>10
epochs, n = 4 recording sessions from four birds). The
mean activity level during the initial 300 msec after song termination
for these four recordings was 1.33 ± 0.17 times background activity level (see also Fig. 11; compare with Fig.
4A-C, right panels).

Figure 13.
Firing of a small cluster of neurons in Area X is
consistently modified before and after singing. A,
Firing rate during multiple presinging and postsinging intervals of
bird mc-g. Instantaneous firing rate was estimated by convolving
multiunit spike arrival times with a 10-msec-wide Gaussian window.
Firing rate for each song initiation and termination is plotted on
successive rows as in Figure 4, with darkness proportional to spike
rate. Absolute spike rate has been normalized by background nonsinging
spike rate. Lower and upper limits of gray scale bar are 0.2 and 3.0. B, Mean firing rate across multiple song initiations and
terminations, normalized by background firing rate. Song initiation and
termination times are denoted by vertical dotted lines.
C, Representative neural activity waveforms from which
firing rates in A and B were calculated.
Note that the time scale for this panel is expanded from that in
A; 1 rather than 2 sec of pre-onset and post-termination
activity level is displayed. Song initiation and termination times are
denoted by vertical dotted lines.
Although Area X neurons always modulated their firing during repeated
motifs, the precision of firing relative to song elements during
individual song renditions was rather low. The unit shown in Figure
14 greatly increased its firing during
the song motif, from a background rate of 115 Hz. It was difficult,
however, to discern any tendency for greater firing to occur at any
particular time during the motif. In all other recordings from Area X,
as well, neural firing appeared more weakly related to individual song
elements than in LMAN, although this may in part reflect the smaller
number of units contributing to each Area X recording compared with
recordings in LMAN. When the cross-covariance function between activity
level and song elements was calculated for Area X recordings (as in
Fig. 10), there was less consistency in locations of maxima and
minima across different birds than was seen for LMAN recordings. In
nine recording sessions from six birds (mean 87 motif
renditions/session, range 13-267), average locations of maxima ranged
from -100 to +68 msec, and of minima from -100 to +100 msec
relative to song; this was distinct from the similarity across birds in
locations of such peaks and minima for LMAN (from Fig.
10D, LMAN peak location = +32 ± 19 msec,
mean minima location = -27 ± 19 msec). Furthermore, the
pattern of activity appeared even less stereotyped across multiple song
renditions than was true for LMAN. For the nine recording sessions from
Area X, the mean cross-rendition c.v. was 0.41 ± 0.22 (range
0.2-0.86; by comparison, the mean for all LMAN recording sessions was
0.18 ± 0.05) (Fig. 6D). Similarly, for Area X
sites the mean correlation coefficient between activity level during
individual renditions and mean activity level was 0.26 ± 0.08 (range 0.16-0.39; the mean for all LMAN recording sessions was
0.48 ± 0.08) (Fig. 6E).

Figure 14.
Single-unit firing in Area X during multiple
renditions of a stereotyped song motif. A, Spectrogram
(top panel) and oscillogram (bottom
panel) representations of a typical song motif of bird
mx-11. B, Single-unit spike arrival time raster plot of
firing during multiple renditions of this song motif. C,
Average instantaneous spike rate, made by convolving all spike arrival
times with a 10-msec-wide Gaussian window. The mean background
nonsinging spike rate is indicated by dotted line.
D, Successive activity waveforms produced during this
motif. E, Activity waveforms of equivalent duration recorded during a
nonsinging period. Audio files of four representative song motifs of
this bird are located at
http://www.keck.ucsf.edu/~neal/jns99/fig14.html.
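The smoothing described for panel C can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: the "10-msec-wide" window is interpreted here as a Gaussian with an SD of 10 msec (the caption does not say whether the width refers to SD or to full width), spike times from all renditions are pooled, and the result is divided by the number of renditions to give a rate in Hz. The function name and parameters are hypothetical.

```python
import numpy as np


def smoothed_spike_rate(spike_times_ms, n_renditions, t_start_ms, t_stop_ms,
                        sigma_ms=10.0, dt_ms=1.0):
    """Average instantaneous spike rate (Hz) from pooled spike arrival times.

    Each spike is replaced by a unit-area Gaussian (SD = sigma_ms, an assumed
    reading of the window width), the Gaussians are summed, and the sum is
    divided by the number of renditions.
    """
    t = np.arange(t_start_ms, t_stop_ms, dt_ms)
    spikes = np.asarray(spike_times_ms, dtype=float)
    diffs = t[:, None] - spikes[None, :]
    kernel = np.exp(-0.5 * (diffs / sigma_ms) ** 2) / (sigma_ms * np.sqrt(2.0 * np.pi))
    # kernel has units of spikes per msec; convert to spikes per second (Hz).
    rate_hz = kernel.sum(axis=1) / n_renditions * 1000.0
    return t, rate_hz
```

Because each Gaussian has unit area, integrating the returned rate over time recovers the mean number of spikes per rendition, so the smoothing changes temporal resolution but not the overall firing level.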
Influence of auditory feedback on singing-related activity
in LMAN
Several lines of indirect evidence suggest that the AFP is
involved in processing reafferent auditory feedback to the song system
during learning. The AFP is required during song acquisition and
modification (Bottjer et al., 1984 ; Sohrabji et al., 1990 ; Scharff and
Nottebohm, 1991 ; Morrison and Nottebohm, 1993 ), and in both adult and
juvenile anesthetized birds it contains neurons that respond
selectively to playback of the bird's own song (Doupe and Konishi,
1991 ; Doupe, 1997 ; Solis and Doupe, 1997 ). To begin to investigate
whether the AFP has an auditory role during singing, we used two
methods of dissociating the activity associated with the motor act of
singing from activity elicited as an auditory response to the bird's
self-produced vocalizations. First, we deafened birds to remove the
reafferent sensory signal during song production, and second, we played
back the sound of the bird's own song outside of the context of
singing. Here, we determined the effect of these manipulations on the
activity of neurons in LMAN.
Complete removal of auditory feedback does not strongly affect
singing-related firing
Complete deafening did not immediately affect song production, as
has been observed previously for zebra finches (Nordeen and Nordeen,
1992 ). An example of this result for a representative bird is shown in
Figure 15A. In one recording
session, this bird produced >300 renditions of a stereotyped
seven-syllable song motif (Fig. 15A, hear). On
the next day, when no auditory feedback was possible (after cochleae
had been removed), the bird produced a very similar motif (Fig.
15A, deaf). Fundamental frequencies of
syllables with prominent harmonic stacks (syllables d, e,
f) were very similar between the two recording sessions
(the largest difference in fundamental frequency was 0.4%; from
813 ± 7.9 to 816 ± 7.8 Hz for syllable
f). Other parameters of syllable structure such as
noisiness and timbre (relative weighting of harmonic frequencies within
a stack) also did not appear grossly affected by deafening (little
difference could be discerned by an experienced listener; for example,
compare hear and deaf song audio files for bird
dc-8 at http://www.keck.ucsf.edu/~neal/jns99/fig15.html). Moreover, there was little difference in the temporal structure of motifs: although motif duration was significantly shorter 1 d after than just before deafening (p < 0.01, t
test; 345 motif renditions before deafening, 250 after deafening), the
magnitude of decrease was very small: from 956 ± 15 to 953 ± 14 msec. Furthermore, this magnitude of difference is within the
range of normal daily variability: motif duration 1 d before
deafening was 947 ± 15 msec.
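The size of these changes can be checked with simple arithmetic on the values reported above. The sketch below uses only the rounded means from the text; it makes no attempt to reproduce the t test, which would require the unrounded data. The standardized difference at the end assumes that the ± values are standard deviations, which is not stated in this section, and the helper name is hypothetical.

```python
import math


def percent_change(before, after):
    """Signed change from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (after - before) / before


# Fundamental frequency of syllable f, hearing vs. deaf (Hz).
f0_change = percent_change(813.0, 816.0)

# Motif duration (msec): just before vs. 1 d after deafening.
deafening_change = percent_change(956.0, 953.0)

# Motif duration (msec): 1 d before deafening vs. just before deafening.
daily_change = percent_change(947.0, 956.0)

# Standardized difference for the deafening comparison, ASSUMING the
# reported +/- values (15 and 14 msec) are SDs over 345 and 250 renditions.
pooled_sd = math.sqrt(((345 - 1) * 15.0 ** 2 + (250 - 1) * 14.0 ** 2) / (345 + 250 - 2))
standardized_diff = (956.0 - 953.0) / pooled_sd

print(f"F0 change: {f0_change:.2f}%")
print(f"Duration change after deafening: {deafening_change:.2f}%")
print(f"Duration change across the preceding day: {daily_change:.2f}%")
print(f"Standardized difference (assumed SDs): {standardized_diff:.2f}")
```

On these rounded figures, the 3 msec shortening after deafening is smaller than the 9 msec difference between the two predeafening days, which is the comparison the text uses to argue that the effect, although statistically significant, falls within normal daily variability.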

Figure 15.
LMAN activity during singing before (while
hearing) and after deafening. A, Spectrograms of song
motifs produced by bird dc-8 several hours before (top
panel, hear) and 1 d after (bottom
panel, deaf) deafening. Syllable identity
is indicated by letters between spectrograms.
B, Multiunit activity level in LMAN during a
representative 150 renditions of the song motif, during a recording
session when the bird could hear (top panel), and
after deafening (bottom panel). Activity level
for both recordings was normalized by average background activity
level. Lower and upper limits of gray scale bar are 0.6 and 2.3. C, Mean activity level during 300 renditions of motif
before deafening (thick trace) and 161 renditions of
motif after deafening (thin trace). Average timing of