Abstract
Sex steroids modulate vertebrate sensory processing, but the impact of circulating hormone levels on forebrain function remains unclear. We tested the hypothesis that circulating sex steroids modulate single-unit responses in the avian telencephalic auditory nucleus, field L. We mimicked breeding or nonbreeding conditions by manipulating plasma 17β-estradiol levels in wild-caught female Gambel's white-crowned sparrows (Zonotrichia leucophrys gambelii). Extracellular responses of single neurons to tones and conspecific songs presented over a range of intensities revealed that estradiol selectively enhanced auditory function in cells that exhibited monotonic rate level functions to pure tones. In these cells, estradiol treatment increased spontaneous and maximum evoked firing rates, increased pure tone response strengths and sensitivity, and expanded the range of intensities over which conspecific song stimuli elicited significant responses. Estradiol did not significantly alter the sensitivity or dynamic ranges of cells that exhibited non-monotonic rate level functions. Notably, there was a robust correlation between plasma estradiol concentrations in individual birds and physiological response properties in monotonic, but not non-monotonic neurons. These findings demonstrate that functionally distinct classes of anatomically overlapping forebrain neurons are differentially regulated by sex steroid hormones in a dose-dependent manner.
Introduction
Sex steroid hormones modulate vocal signaling in adult vertebrates (Bass, 2008; Brenowitz, 2008), but how these modulations impact the auditory function of listeners remains unclear. Songbirds are well suited for addressing this issue. Song is a complex, learned vocalization that serves several functions, including species and individual identification, mate attraction, and territory defense (Catchpole and Slater, 1995). In seasonal breeders, such as Gambel's white-crowned sparrow, song behavior is sensitive to hormonal state; high levels of circulating sex steroid hormones, typical of the breeding season (Wingfield and Farner, 1978), increase singing rate, song duration, and song stereotypy (Smith et al., 1995; Meitzen et al., 2009a). Associated changes in the morphology and physiology of the neural circuit underlying song production are also observed (Nottebohm, 1981; Brenowitz et al., 1991; Smith et al., 1997; Tramontin et al., 2003; Soma et al., 2004; Park et al., 2005; Meitzen et al., 2007a,b, 2009b; Phillmore et al., 2011).
Recent work has explored the effect of sex steroid hormones on songbird auditory sensitivity at the level of the periphery and brainstem (Henry and Lucas, 2009; Caras et al., 2010). Additional studies have examined the impact of sex steroid hormones on the processing of song stimuli in regions specialized for song perception or production such as the sensorimotor nucleus HVC (proper name), the caudomedial nidopallium (NCM) or the caudomedial mesopallium (CMM) (Maney et al., 2006; Tremere et al., 2009; Remage-Healey et al., 2010, 2012; Sanford et al., 2010; Phillmore et al., 2011; Tremere and Pinaud, 2011; Remage-Healey and Joshi, 2012).
Many issues remain unexplored. First, neurons in NCM, the main focus for the majority of studies on this topic, express hormone receptors (Bernard et al., 1999; Gahr, 2001; Jeong et al., 2011). Receptor expression is not a prerequisite for hormonal sensitivity, however, as steroid action can be mediated via other neuromodulatory systems (Maney and Pinaud, 2011). It is therefore of interest to determine whether auditory regions upstream of NCM, some of which lack sex steroid receptors, are also affected by hormonal state. Similarly, it is unclear whether circulating sex steroids modulate fundamental aspects of auditory processing in the forebrain, and if so, whether the magnitude of these modulations depends on the plasma level of hormone. To address these questions, we brought adult female sparrows into breeding [high 17β-estradiol (E2)] or nonbreeding condition (low E2) in the laboratory and made in vivo extracellular recordings from single units in the forebrain field L complex. Field L is the primary thalamic recipient of auditory information and is analogous to mammalian primary auditory cortex (Fortune and Margoliash, 1992; Vates et al., 1996; Reiner et al., 2004) (see Fig. 1B). Unlike cells in downstream nuclei, field L neurons do not express steroid receptors (Jeong et al., 2011; Maney and Pinaud, 2011). We found that modulations of systemic E2 affect many fundamental response properties in monotonic field L neurons.
Materials and Methods
Subjects
Adult female Gambel's white-crowned sparrows (n = 21) were captured in eastern Washington state during autumn and spring migrations between 2007 and 2011. Birds were housed in outdoor aviaries at the University of Washington for up to 30 weeks before being moved to indoor aviaries. Once inside, all birds were housed in groups on a short-day photoperiod (SD) (8 h light/16 h dark) for a minimum of 10 weeks to ensure sensitivity to the stimulating effects of hormones and photoperiod (Wingfield et al., 1979). Food and water were available ad libitum. All procedures were approved by the Institutional Animal Care and Use Committee at the University of Washington, Seattle.
Hormone and photoperiod manipulations
Birds were brought into either nonbreeding-like condition or breeding-like condition in the laboratory. To induce a nonbreeding condition, we housed birds (n = 12) on a SD photoperiod as above. Birds housed on a SD photoperiod maintain regressed gonads, have basal plasma sex hormone levels, and display neural morphology and physiology typical of the nonbreeding season (Middleton, 1965; Smith et al., 1995; Tramontin et al., 2000; Park et al., 2005; Meitzen et al., 2007b). To induce a breeding condition, we housed birds (n = 9) on a long day (LD) (20 h light/4 h dark) photoperiod typical of their Alaskan breeding grounds and implanted them with subcutaneous hormone pellets made from SILASTIC tubing (inner diameter, 1.0 mm; outer diameter, 2.0 mm; length, 12 mm; VWR). Pellets were filled with crystalline E2, rinsed in ethanol, and soaked overnight in 0.1 M PBS before implantation (Tramontin et al., 2003). Supplemental hormone is necessary to raise plasma hormone levels of laboratory-housed birds to physiological levels observed in breeding birds in the wild (Smith et al., 1995). Birds were housed under these conditions for 3 weeks; this time period is sufficient to induce neural morphology and physiology typical of the breeding season (Tramontin et al., 2000; Park et al., 2005; Meitzen et al., 2007b).
Electrophysiology
Surgical procedures
All experiments took place in a double-walled acoustically isolated chamber (Acoustic Systems). At the beginning of each experiment, birds were anesthetized with 25% urethane (6 μl/g body weight; Thermo Fisher Scientific), divided evenly into three intramuscular injections separated by 30 min. Supplementary doses (0.67 μl/g) were delivered throughout the experiment to maintain anesthetic state as assessed by toe pinch. After birds were fully anesthetized, we injected 0.1 ml of 1% lidocaine (APP Pharmaceuticals) subcutaneously at the dorsal midline of the skull, made an incision, and removed the skin and fascia. A metal post was fixed to the skull with dental cement (Lang Dental), and birds were secured to a head holder/stereotaxic device. Body temperature was maintained at 40–42°C by a heating pad using a cloacal thermal probe and digital controller (TC-1000 Temperature Controller; CWE). A small craniotomy was made dorsal to field L in the right hemisphere using stereotaxic coordinates relative to the bifurcation of the midsagittal sinus (1.4 mm lateral, 1.8–2.3 mm anterior). The dura was removed and a glass micropipette electrode (5–19 MΩ impedance) filled with 10% Fluoro-Ruby [10,000 molecular weight (MW) tetramethylrhodamine dextran; Invitrogen] or 10% biotinylated dextran amine (BDA) (10,000 MW, Invitrogen) in 0.9% NaCl was positioned over the opening. The electrode was advanced by an electric microdrive (Newport), which was controlled by the experimenter from outside the sound attenuation booth. For some recording sessions, the craniotomy opening was covered in petroleum jelly to prevent tissue dehydration. We made one to three electrode penetrations in each bird. Although we recorded activity at a wide range of depths (∼800–3300 μm), we restricted our analysis to units that were confirmed histologically to be within field L (see below, Electrode track reconstruction).
Auditory processing is lateralized in songbirds, although the exact nature of hemisphere specificity depends on sex, species, brain area, anesthetic state, stimulus selection, and method of analysis (Cynx et al., 1992; George et al., 2004, 2005; Avey et al., 2005; Hauber et al., 2007; Poirier et al., 2009; Phan and Vicario, 2010). To avoid introducing a lateralization confound into our experimental design, we chose to focus only on the right hemisphere.
Stimulus delivery and calibration
The stimulus delivery system we employed has been used previously (Caras et al., 2010). Briefly, a small speaker (Etymotics ER-2B) and microphone (Etymotics ER-10B) were enclosed within a custom-made sound delivery tube and positioned flush against the skull surrounding the left external auditory meatus. Petroleum jelly was applied to the outside of the tube and skull, creating a closed sound delivery system. Sound delivery was controlled by custom scripts (Python) running on a computer located outside the sound attenuation chamber. Stimuli were routed through an RX6 multifunction processor (Tucker-Davis Technologies) that performed both digital/analog conversion and attenuation of the signal before delivery to the speaker.
Before each experiment, we used random-phase band-limited (6 Hz to 20 kHz) white noise to calibrate pure-tone sound pressure levels (decibels SPL re: 20 μPa). For our initial experiments, we used the white-noise generated calibration table to determine root-mean squared sound pressure levels (RMS decibels SPL) for song stimuli. In later experiments, we presented individual songs to the microphone and determined RMS decibel SPL values separately for each song. The levels for earlier recordings were corrected for each song type presented. RMS amplitudes for song stimuli were reliable within ≤4.9 dB SPL.
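For readers unfamiliar with the level convention used here, the following Python sketch shows how an RMS level in decibels SPL (re: 20 μPa) could be computed from a calibrated pressure waveform. The function names and the assumption that samples are already expressed in pascals are illustrative only; they do not describe the calibration software used in the study.

```python
import math

P_REF_PA = 20e-6  # reference pressure: 20 micropascals


def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def db_spl(pressure_samples_pa):
    """RMS level in dB SPL for a waveform given in pascals (assumed calibrated)."""
    return 20.0 * math.log10(rms(pressure_samples_pa) / P_REF_PA)


# Hypothetical waveform with an RMS pressure of 0.2 Pa.
example_level = db_spl([0.2, -0.2, 0.2, -0.2])
```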
Auditory stimuli
We presented two different types of stimuli in this study. Pure-tone stimuli were 100 ms in duration with 5 ms linear ramp rise–fall times. Tones were generated on-line using the same custom software that controlled sound delivery. Song stimuli consisted of a set of songs recorded from seven individual male sparrows held under breeding condition in the laboratory. We used male songs because females of this species do not sing. We recorded songs using Syrinx software (John Burt; www.syrinxpc.com, University of Washington, Seattle, WA) and previously published protocols (Meitzen et al., 2007b, 2009a). Low-frequency background noise was digitally filtered off-line. We recorded one song from each bird, for a total of seven songs.
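As a rough illustration of the pure-tone stimuli described in this section, the sketch below synthesizes a 100 ms tone with 5 ms linear rise and fall ramps. The sample rate, the unit peak amplitude, and the function name are assumptions made for the example; they are not taken from the custom software used to generate stimuli on-line.

```python
import math


def make_tone(freq_hz, dur_s=0.100, ramp_s=0.005, fs=48000):
    """Return samples of a tone with linear rise-fall ramps.

    fs (sample rate, Hz) is an assumed value, not one reported in the text.
    """
    n = int(round(dur_s * fs))
    n_ramp = int(round(ramp_s * fs))
    samples = []
    for i in range(n):
        # Linear envelope: rises over the first n_ramp samples, falls over
        # the last n_ramp samples, and is capped at 1.0 in between.
        env = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(env * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return samples


tone = make_tone(4000.0)  # hypothetical 4 kHz tone
```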
Gambel's white-crowned sparrow songs typically consist of five syllables: a whistle, a warble, and three buzzes (see Fig. 1A). The songs presented in this study were 2.15 ± 0.19 s in duration (mean ± SD) and spanned an average frequency range of 2.44–5.98 kHz. These values are similar to those previously published for a larger song set (Meitzen et al., 2009a).
The majority of song stimuli (see Fig. 1A, 1 through 4) were recorded from captive males before 2007, and thus were unfamiliar to all of our experimental birds. Three songs (see Fig. 1A, 5 through 7) were recorded from males that had overlapping periods of captivity with some of the experimental birds, and therefore may have been familiar. Although song familiarity can affect neurophysiological responses in more specialized auditory regions, such as NCM, field L does not display this characteristic (Theunissen et al., 2004). We therefore did not include song familiarity as a factor in our data analysis.
Data acquisition
We recorded the extracellular activity of well isolated single units. Spikes were amplified 10,000× (ISO-80; World Precision Instruments; and MA3; Tucker-Davis Technologies), bandpass filtered 0.1–10 kHz with a 24 dB/octave roll-off (Krohn-Hite model 3550), digitized at 24.4 kHz (RX6 multifunction processor; Tucker-Davis Technologies), and monitored on-line via a digital oscilloscope and audio speaker. Custom data acquisition software displayed spike trains, isolated waveforms, and raster plots in real time. We analyzed raw waveforms off-line using custom MATLAB scripts (David Schneider and Sarah Woolley, Columbia University, New York, NY) to ensure that only well isolated single units were included in the dataset. Neurons were assessed to be well isolated by the following criteria: (1) a stable waveform shape, (2) a high (>4) signal-to-noise ratio, and (3) the absence of any interspike intervals <1 ms. The vast majority of recordings (71 of 77) met all three criteria. The remaining six recordings demonstrated the presence of two clearly separable waveforms with high signal-to-noise ratios. These waveforms were manually sorted off-line; sorting efficacy was additionally verified by principal-components analysis.
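Criteria (2) and (3) above lend themselves to a simple automated check. The Python sketch below is one possible way to express them; it is not the MATLAB code used in the study. The function name is hypothetical, the signal-to-noise ratio is taken as an input because its exact definition is not given here, and criterion (1), waveform stability, is not modeled.

```python
def passes_isolation(spike_times_s, snr, min_isi_s=0.001, min_snr=4.0):
    """Check SNR > 4 and the absence of interspike intervals < 1 ms.

    spike_times_s: spike times in seconds (any order).
    snr: signal-to-noise ratio, computed elsewhere by the caller.
    """
    times = sorted(spike_times_s)
    # Interspike intervals between consecutive spikes.
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    no_violations = all(isi >= min_isi_s for isi in isis)
    return snr > min_snr and no_violations
```

For example, a hypothetical unit with spikes at 0, 10, and 25 ms and a signal-to-noise ratio of 6 passes, whereas the same unit with an extra spike at 0.5 ms does not.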
Band-limited white noise (0.25–8 kHz) at 80 dB SPL was used as a search stimulus. Once a unit was well isolated, we presented song and tone stimuli, with the order of presentation randomized across cells. For song trials, we chose one song exemplar at random and presented it from 90 to 10 dB SPL in 10 dB descending steps at a rate of 0.14/s. We recorded 5–10 trials at each intensity. For tone presentations, we initially estimated the characteristic frequency (CF) and best threshold of the unit on-line. We then presented a range of frequencies around the CF, in increments approximately equal to 10% of the CF. Each frequency was presented from 90 to 10 dB SPL in 10 dB descending steps at a rate of 1.25/s to construct the full response area of the unit. We recorded 5–10 trials for each frequency–intensity pair, with the order of frequency presentation randomized across trials.
It should be noted that the stimulus intensities used here (10–90 dB SPL) are similar to the sound amplitudes that would be experienced by free-living birds in the wild. Avian species are capable of singing at high intensities, with maximum values ranging from 74 to 105 dB SPL at 1 m (Brackenbury, 1979; Brenowitz, 1982), although some species can generate song amplitudes as high as 111.5 dB SPL [e.g., the Screaming Piha (Lipaugus vociferans) (Nemeth, 2004)]. Therefore, we consider the stimulus levels used here to be within a normal, ethologically relevant range.
Data analysis
Tone responses.
A unit was considered tone responsive if its average stimulus-evoked firing rate was significantly different (Student's paired t test, p < 0.05) than its average spontaneous firing rate (calculated from the 100 ms immediately preceding tone onset). The vast majority (54 of 56) of tone responses were excitatory. The remaining two cells (one from a nonbreeding female, one from a breeding female) gave what appeared to be postinhibitory rebound responses, as evidenced by strong excitation immediately following tone offset. If these cells were truly exhibiting postinhibitory rebound, their firing rates should be suppressed during tone presentation. The spontaneous firing rates of the cells were already quite low (1.48 and 1.67 spikes (sp)/s, respectively), however, making it difficult to detect a suppressive response. Because these potentially suppressive responses were so rare, we removed them from the tone analyses. Both of these cells did show suppressive responses to songs and were included in the song analyses (see below, Song responses).
To determine the pure-tone sensitivity of a unit, we measured the threshold for each stimulus frequency. Threshold was defined as the lowest intensity (decibels SPL) to elicit a significant response. An additional criterion was that successively higher level stimuli must also elicit reliable responses. The CF was identified as the stimulus frequency with the lowest threshold. If multiple frequencies had the same (lowest) threshold, CF was defined as the stimulus with the greatest response strength (RS) at threshold. Here, we define RS as the difference between the average stimulus-evoked firing rate and the average spontaneous firing rate during the 100 ms immediately preceding tone onset, a window equal to the duration of the tone.
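The threshold and CF definitions above can be sketched in Python as follows. This is an illustrative reading rather than the analysis code used in the study: the function names are hypothetical, the significance flags are assumed to come from the paired t tests described earlier, and the "successively higher level" criterion is interpreted here as requiring a significant response at the candidate level and at the next higher level tested.

```python
def response_strength(evoked_rate, spont_rate):
    """RS: evoked firing rate minus spontaneous firing rate (spikes/s)."""
    return evoked_rate - spont_rate


def tone_threshold(levels_db, significant):
    """Lowest level that is significant along with the next higher level.

    Returns None if no level satisfies this (assumed) criterion.
    """
    pairs = sorted(zip(levels_db, significant))  # ascending level
    for (level, sig), (_, next_sig) in zip(pairs, pairs[1:]):
        if sig and next_sig:
            return level
    return None


def characteristic_frequency(per_freq):
    """per_freq maps frequency (Hz) -> (threshold_db, rs_at_threshold).

    The lowest threshold wins; ties go to the larger RS at threshold.
    """
    candidates = {f: v for f, v in per_freq.items() if v[0] is not None}
    return min(candidates, key=lambda f: (candidates[f][0], -candidates[f][1]))
```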
We measured the frequency bandwidth 10 dB above the best threshold of the neuron as an indicator of frequency tuning. In addition, we made the following measurements of the responses to the CF: First, we identified the maximum average evoked firing rate, and the stimulus intensity that elicited the maximum response (max decibels). Second, we set a noise floor 2 SDs above the baseline rate of the neuron. We then defined the firing rate range (spikes per second) of the neuron as the difference between the noise floor and the maximum evoked firing rate. Third, we calculated the dynamic range of the neuron, or the range of stimulus intensities within which a neuron is sensitive to differences in intensity. The dynamic range (decibels SPL) was calculated as the difference between the max decibels and the threshold.
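The CF measurements described above amount to a few arithmetic steps, sketched here in Python. The function and key names are hypothetical. Two assumptions are made that the text does not specify: the SD is computed across per-trial spontaneous rates using the sample standard deviation, and when several levels share the maximum rate the lowest such level is taken as max decibels.

```python
import statistics


def cf_measures(levels_db, evoked_rates, spont_trial_rates, threshold_db):
    """Noise floor, firing rate range, and dynamic range at CF."""
    pairs = sorted(zip(levels_db, evoked_rates))  # ascending level
    # Noise floor: 2 SDs above the baseline (spontaneous) rate.
    noise_floor = (statistics.mean(spont_trial_rates)
                   + 2 * statistics.stdev(spont_trial_rates))
    max_rate = max(rate for _, rate in pairs)
    max_db = next(level for level, rate in pairs if rate == max_rate)
    return {
        "noise_floor": noise_floor,
        "max_rate": max_rate,
        "max_db": max_db,
        "rate_range": max_rate - noise_floor,          # spikes/s
        "dynamic_range_db": max_db - threshold_db,     # dB
    }
```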
During the early phases of our studies, it became clear that neuronal responses in field L could be either monotonically related to tone intensity at CF, or non-monotonic. Monotonic and non-monotonic neurons are thought to play different roles in auditory coding (Polley et al., 2006; Sadagopan and Wang, 2008; Watkins and Barbour, 2011), raising the possibility that breeding condition might modulate each neuronal population in a distinctive manner. We therefore chose to analyze monotonic and non-monotonic responses separately, as discussed below.
To objectively determine whether a cell should be considered monotonic or non-monotonic, we set a boundary halfway between the noise floor and the maximum average evoked firing rate. A neuron was considered non-monotonic if its average evoked firing rate dropped below this boundary at stimulus intensities above the decibel level that evoked the maximum firing rate. If the cell maintained a high evoked firing rate, staying above this boundary, it was considered monotonic (see Fig. 2).
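The classification rule above can be written compactly. The following Python sketch is an illustration under stated assumptions, not the original analysis code: the function name is hypothetical, the noise floor is supplied by the caller, and ties in the maximum rate are resolved by taking the lowest level at which the maximum occurs.

```python
def classify_rate_level(levels_db, evoked_rates, noise_floor):
    """Label a rate level function as 'monotonic' or 'non-monotonic'."""
    pairs = sorted(zip(levels_db, evoked_rates))  # ascending level
    max_rate = max(rate for _, rate in pairs)
    max_db = next(level for level, rate in pairs if rate == max_rate)
    # Boundary halfway between the noise floor and the maximum evoked rate.
    boundary = noise_floor + 0.5 * (max_rate - noise_floor)
    # Non-monotonic if the rate falls below the boundary at any level
    # above the level that evoked the maximum rate.
    dropped = any(rate < boundary for level, rate in pairs if level > max_db)
    return "non-monotonic" if dropped else "monotonic"
```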
To determine whether these categorizations truly reflected two separate populations of neurons, we calculated a monotonicity index (MI) for each cell. The MI ranges from 0 to 1, with increasing values indicative of increasing degrees of monotonicity. Similar measures of monotonicity have been used previously by other researchers (Sutter and Schreiner, 1995; Recanzone et al., 2000; de la Rocha et al., 2008; Watkins and Barbour, 2011). The MI was calculated for each cell as follows: MI = (rate evoked at the highest pure-tone amplitude presented)/(maximum evoked rate of the neuron). In the majority of our cases (27 of 28 monotonic and 23 of 25 non-monotonic neurons), the highest pure-tone amplitude tested was 90 dB SPL. In the remaining three cases, the highest amplitude tested was 80 dB SPL.
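A minimal Python sketch of the MI calculation is given here for concreteness. The function name and the example rate level functions are hypothetical and do not represent recorded data.

```python
def monotonicity_index(levels_db, evoked_rates):
    """MI: rate at the highest level presented divided by the maximum rate."""
    pairs = sorted(zip(levels_db, evoked_rates))  # ascending level
    rate_at_highest_level = pairs[-1][1]
    max_rate = max(rate for _, rate in pairs)
    return rate_at_highest_level / max_rate


levels = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# Hypothetical "N"-shaped function: the rate recovers at 90 dB SPL.
n_shaped = [2, 5, 20, 30, 18, 10, 6, 4, 25]
mi_full = monotonicity_index(levels, n_shaped)
# Same cell, treating 80 dB SPL as the highest amplitude.
mi_to_80 = monotonicity_index(levels[:-1], n_shaped[:-1])
```

In this made-up example the MI exceeds 0.70 when 90 dB SPL is included and falls well below 0.70 when the calculation stops at 80 dB SPL, which shows how sensitive the index is to the choice of highest amplitude.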
One non-monotonic neuron recorded in a breeding female had particularly strong tone and song responses. To determine whether this cell was an outlier, we averaged tone and song-evoked |RS| values separately across stimulus level for each non-monotonic cell recorded under breeding condition. We used these average values to perform Dixon's Q test for outliers (Dixon, 1950). We found that the cell in question was an outlier at the 99% confidence level (Rorabacher, 1991) for both tone and song-evoked responses. We therefore removed this cell from all analyses.
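For reference, the basic form of Dixon's Q statistic can be computed as in the Python sketch below. This is a simplified illustration: it uses only the simplest ratio (gap divided by range), whereas other ratio variants are sometimes recommended for larger samples, and it leaves the comparison against a tabulated critical value to the caller. The function name and the example values are hypothetical.

```python
def dixon_q(values):
    """Return (Q, suspect_value) for the most extreme value in the sample.

    Q is the gap between the suspect value and its nearest neighbor,
    divided by the full range of the data.
    """
    if len(values) < 3:
        raise ValueError("Dixon's Q test needs at least three values")
    x = sorted(values)
    data_range = x[-1] - x[0]
    if data_range == 0:
        raise ValueError("all values are identical")
    q_low = (x[1] - x[0]) / data_range
    q_high = (x[-1] - x[-2]) / data_range
    return (q_high, x[-1]) if q_high >= q_low else (q_low, x[0])


# Hypothetical mean |RS| values (spikes/s) for a small group of cells.
q_stat, suspect = dixon_q([4.0, 5.0, 5.5, 6.0, 20.0])
```

The suspect value is flagged as an outlier only if Q exceeds the critical value tabulated for the sample size and the chosen confidence level.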
Song responses.
To determine whether a unit was responsive to song, we first established a noise floor 2 SDs above and below the spontaneous rate of the neuron. For a unit to be considered song responsive, its evoked firing rate had to fulfill the following criteria at a minimum of two consecutive song intensities: (1) surpass the noise floor and (2) be statistically different (Student's paired t test, p < 0.05) than the average spontaneous firing rate during the 2000 ms immediately preceding song onset, a window approximately equal to the duration of each song stimulus. We found that these criteria reliably included units that were considered responsive by an experienced observer, while minimizing false positives. Two neurons clearly responded to song, but only at the highest stimulus intensity tested, and therefore could not meet the response criteria. An observer experienced in single-unit physiology blinded to the experimental conditions examined raster and poststimulus time histogram (PSTH) plots. A decision to include these cells in the analysis was made after this observer agreed that the neurons showed increased activity during song presentation.
A unit's song threshold (decibels SPL) was defined as the lowest of at least two consecutive intensities to elicit a significant response. We then identified the maximum average evoked firing rate (for excitatory song responses), the minimum average evoked firing rate (for suppressive song responses), and the stimulus intensity that elicited the maximum or minimum firing rate (max or min decibels, respectively). We calculated the song dynamic range (decibels SPL) as the difference between max or min dB and the threshold. Finally, similar to tones, we used RS (spikes per second) as a measure of response magnitude. Songs elicited both excitatory and suppressive responses (see Fig. 6), however, which resulted in positive and negative RS values, respectively. To analyze all song responses as a whole, we used the absolute value of RS.
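The song response criteria and threshold definition can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, the p values are assumed to come from the paired t tests described above, and the sketch does not check that consecutive responses have the same sign (excitatory or suppressive).

```python
def song_threshold(levels_db, evoked_rates, p_values,
                   spont_mean, spont_sd, alpha=0.05):
    """Lowest of at least two consecutive levels meeting both criteria.

    A level qualifies if the evoked rate lies outside the noise floor
    (spontaneous mean +/- 2 SDs) and the test p value is below alpha.
    Returns None if the unit never qualifies at two consecutive levels.
    """
    upper = spont_mean + 2 * spont_sd
    lower = spont_mean - 2 * spont_sd
    rows = sorted(zip(levels_db, evoked_rates, p_values))  # ascending level
    meets = [(level, (rate > upper or rate < lower) and p < alpha)
             for level, rate, p in rows]
    for (level, ok), (_, next_ok) in zip(meets, meets[1:]):
        if ok and next_ok:
            return level
    return None
```

Under this rule, a unit that responds only at the highest intensity tested returns None, which is why such units required separate inspection by a blinded observer.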
Electrode track reconstruction
Two injections of either 10% Fluoro-Ruby (20 of 21 birds) or 10% BDA (1 of 21 birds) were made at the end of each electrode penetration to enable off-line reconstruction of recording sites. Fluoro-Ruby was injected iontophoretically through the recording pipette by using a current source (BAB-501; Kation Scientific) set to +10 μA for 1 min, followed by +4 μA (alternating 7 s on/off) for 8 min. BDA was injected with 5–10 rapid 40 ms pulses of nitrogen gas at 20 psi using a picospritzer (Parker).
At the end of each recording session, birds were perfused transcardially with ice-cold PBS, followed by 4% paraformaldehyde. Brains were removed, postfixed in paraformaldehyde, cryoprotected in 30% sucrose, embedded in gelatin, and postfixed in a 20% sucrose/10% neutral buffered formalin solution for 48 h. Parasagittal 40 μm sections were cut on a freezing microtome and floated in 0.05 M PB. Sections were mounted onto gelatin-subbed slides and processed for Nissl; alternates were air dried until fluorescent or BDA processing.
Sections containing Fluoro-Ruby injections were cleared in xylene, coverslipped in DPX mounting medium (Electron Microscopy Sciences), and dried overnight. Sections containing BDA injections were incubated in 30% hydrogen peroxide in 100% methanol, rehydrated in PBS, incubated in ABC (Vector Laboratories), and visualized using DAB (3,3′-diaminobenzidine) (Sigma-Aldrich). All images were captured on an Olympus BH2 microscope fitted with a QImaging camera and QCapture software.
Only units that could be localized unambiguously to field L were included in our analyses. It should be noted here that differences in spectrotemporal tuning have been reported for the different subregions of the field L complex (Sen et al., 2001; Nagel and Doupe, 2008; Kim and Doupe, 2011), raising the possibility that E2 has disparate effects on these different areas. There was insufficient statistical power to allow analysis by subregion, however, as our experimental design already consisted of multiple independent variables. We therefore did not separate our recording sites into anatomical subregions for our analysis.
Hormone measurement
Immediately before each recording session, we collected blood from the alar wing vein of each bird into a heparinized tube and centrifuged the sample at 4°C. Separated plasma was stored at −80°C until ELISA. Estradiol levels were measured using a kit (Cayman Chemicals) that had not previously been used with this species, so the assay was first validated as described below.
Multiple controls were used to assess the validity of the kit. First, plasma samples were pooled from multiple sparrows and stripped of steroids by incubating with dextran-coated charcoal in assay buffer (Sigma-Aldrich). This stripped plasma is expected to contain no estradiol, or only very low levels. Second, stripped plasma was spiked with E2 to 3200 pg/ml and serially diluted. This serial dilution is expected to parallel the standard curve of the kit. Third, raw (unstripped) plasma was divided into two samples, one of which was spiked with 1000 pg/ml E2. These samples are expected to differ in E2 concentration by exactly 1000 pg/ml, and this comparison thus tests the precision of the kit. Finally, to determine whether lipids or proteins endogenous to white-crowned sparrow plasma interfere with the assay, hormones were extracted from all of the samples outlined above, reconstituted in assay buffer, and assayed separately.
To extract hormones, anhydrous diethyl ether was added to each sample aliquot and vortexed for 1 min. The ether fraction was pipetted into a new test tube, and the extraction was repeated for the remaining plasma layer. Ether fractions were combined for each sample and evaporated under nitrogen gas. Dried, extracted hormone was resuspended in the kit assay buffer and samples were stored at 4°C until use.
Results from the validation assay were as expected: stripped plasma contained extremely low levels of estradiol, serial dilutions paralleled the standard curve of the kit, and raw-spiked plasma differed from raw plasma by ∼1000 pg/ml. No dramatic differences were observed between extracted samples assayed in buffer and those assayed in raw plasma; therefore, we did not extract hormone from experimental samples and instead assayed the raw plasma directly.
We ran 50 μl aliquots of each sample along with eight estrogen standards (6.6–4000 pg/ml) in a single assay following the protocol of the kit. Some samples were lost during preparation; therefore, only seven samples were assayed for each experimental group. Most samples and all of the kit standards were run in duplicate; however, three samples in each experimental group were run singly because of insufficient sample volume. Briefly, we incubated each sample with 50 μl of E2 antiserum and 50 μl of an E2-acetylcholinesterase conjugate for 1 h. After emptying and washing the plate, we added 200 μl of enzymatic substrate (Ellman's reagent) to all sample wells. After a 1 h incubation, we read the plate immediately at 405 nm on a Dynex MRX II microplate reader.
We plotted the optical densities of the kit standards as a function of known E2 concentration and fit the points with a sigmoidal four-parameter logistic curve (4PLC); sample hormone levels were extrapolated from this standard curve. Intraassay variability was 6.50%.
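To make the standard-curve step concrete, the Python sketch below shows a four-parameter logistic function and its algebraic inverse, which converts an optical density back into a concentration. It assumes that the curve used was a standard four-parameter logistic of this form. The parameter values are invented for illustration and are not the fitted values from this assay; the fitting step itself, typically a nonlinear least-squares procedure, is not shown.

```python
def four_pl(conc, a, b, c, d):
    """Four-parameter logistic curve.

    a: response at zero concentration, d: response at infinite concentration,
    c: inflection point (concentration), b: slope factor.
    """
    return d + (a - d) / (1.0 + (conc / c) ** b)


def inverse_four_pl(od, a, b, c, d):
    """Concentration that yields optical density od.

    Only defined for od strictly between d and a.
    """
    return c * ((a - d) / (od - d) - 1.0) ** (1.0 / b)


# Hypothetical curve parameters (not from the study).
params = dict(a=1.2, b=1.1, c=150.0, d=0.05)
od_at_400 = four_pl(400.0, **params)
recovered = inverse_four_pl(od_at_400, **params)
```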
Statistics
Monotonic and non-monotonic neurons were analyzed separately. To measure the effect of breeding condition on tone and song-evoked |RS| values, we set breeding condition as the between-subjects variable and stimulus level as the within-subject variable in two-way repeated-measures mixed-model ANOVAs. For some cells, we had an incomplete dataset, such that a given stimulus (tone or song) was only presented for a limited range of intensities. These missing values presented an obstacle for running a repeated-measures ANOVA. We therefore performed each ANOVA twice: In one version, we included all the cells in the dataset and discarded any stimulus level with missing values. In the other version, we discarded any cells that had missing values and included all the stimulus levels. Both of these versions gave similar results; therefore, we report here only the results obtained when all cells were included in the ANOVAs.
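The two ways of handling missing values described above can be illustrated with a short Python sketch. The data layout (a dictionary mapping each cell to its per-level |RS| values) and the function names are assumptions made for the example; the ANOVAs themselves were run in the statistics packages named in this section and are not reproduced here.

```python
def drop_incomplete_levels(data, levels):
    """Version 1: keep every cell, discard any level missing in some cell."""
    kept = [lvl for lvl in levels
            if all(lvl in cell for cell in data.values())]
    return {name: {lvl: cell[lvl] for lvl in kept}
            for name, cell in data.items()}


def drop_incomplete_cells(data, levels):
    """Version 2: keep every level, discard any cell with a missing level."""
    return {name: dict(cell) for name, cell in data.items()
            if all(lvl in cell for lvl in levels)}


# Hypothetical |RS| values (spikes/s); cell_b lacks the 90 dB measurement.
levels = [70, 80, 90]
data = {
    "cell_a": {70: 4.0, 80: 6.5, 90: 9.0},
    "cell_b": {70: 3.0, 80: 5.0},
}
all_cells = drop_incomplete_levels(data, levels)
all_levels = drop_incomplete_cells(data, levels)
```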
We used a Mann–Whitney U test to compare E2 levels across experimental conditions. All correlations (between song and tone thresholds or between hormone levels and firing rates) were assessed with Pearson's r. For the remainder of our analyses, we indicate which statistical tests were used in table legends, or in Results, when appropriate. Unless otherwise stated, all values are reported as means ± SEMs. All statistical analyses were made using PASW Statistics 18.0 or GraphPad Prism.
Results
Plasma E2 levels
Females housed under breeding (LD+E2) condition had elevated levels of plasma E2 compared with females housed under nonbreeding (SD) condition (397.8 ± 187.5 vs 26.3 ± 8.13 pg/ml; Mann–Whitney U = 1.000; n1 = n2 = 7; p = 0.003). Plasma E2 levels in birds housed under breeding condition were similar to the physiological range reported by Wingfield and Farner (1978) for wild breeding female white-crowned sparrows (∼300–500 pg/ml).
Auditory responses of field L neurons
We recorded from a total of 77 auditory-responsive cells histologically confirmed to be in field L (Fig. 1C–E). Of these, 30 auditory cells were recorded from 9 birds in breeding condition and 47 cells were recorded from 12 birds in nonbreeding condition (Table 1). For some cells, we were only able to record song responses (either because the cell was unresponsive to tones, or because we could not hold the isolation long enough to record a full tone response area). Similarly, in another subset of cells, we were only able to record tone responses. We were able to record both song and tone responses in a final subset of cells.
Tone responses
Tone responses can be monotonic or non-monotonic
Tone-responsive neurons in field L can be categorized as monotonic or non-monotonic, based on the shape of their rate level function. Monotonic neurons increase their firing rate with increasing stimulus intensities (Fig. 2A,C). Conversely, the firing rate of non-monotonic neurons increases up to some midlevel stimulus intensity before decreasing at higher intensities (Fig. 2B,D).
We calculated an MI for each neuron to determine whether monotonic and non-monotonic cells were two separate populations. The results of this analysis are shown in Figure 2E. While there is a small amount of overlap between the two groups, the distributions clearly segregate from one another. Only 3 of 25 neurons classified as non-monotonic have MIs > 0.70. Each of these neurons had classic “inverted V”-shaped rate level functions, up through 80 dB SPL. At 90 dB SPL, each of these cells showed an increase in activity, such that their overall rate level function was “N” shaped. This “N” shape accounted for the high MI values in these cells. If the MI was instead calculated using 80 dB SPL as the maximum stimulus amplitude, each of these cells showed values of MI < 0.70.
Similarly, only 2 of 28 cells that were classified as monotonic had values of MI < 0.70. Both of these cells showed rate level functions that saturated at midlevel intensities, with a small decrease in firing rate at 90 dB SPL. This decrease, although not large enough for us to classify the cells as non-monotonic, accounts for the lower MI values.
When we compared the groups using a two-sample t test, we found that monotonic neurons had significantly higher MIs than non-monotonic neurons (0.887 ± 0.025 vs 0.410 ± 0.054; t(51) = 8.331; p < 0.001). Together, these findings suggest that the monotonic and non-monotonic cells we report on here likely comprise two distinct populations of neurons.
Monotonic (n = 28) and non-monotonic (n = 25) cells were equally abundant in field L, and breeding condition had no effect on their relative proportions. Similarly, spike half-widths of monotonic and non-monotonic neurons remained stable across breeding conditions. These results are presented in more detail with their accompanying statistics in Table 2.
The anatomical positions of monotonic and non-monotonic neurons did not differ across the anterior–posterior and dorsal–ventral extents of field L. Individual recording sites overlapped along both the rostral–caudal and dorsal–ventral axes (Fig. 3), and breeding condition had no effect on the spatial distribution of monotonic or non-monotonic neurons (Table 2).
Breeding condition does not affect CF distributions or frequency tuning
Tone-responsive neurons in the avian auditory forebrain are tuned to specific frequencies arranged in a topographic manner (Müller and Leppelsack, 1985; Wild et al., 1993). We investigated whether the CF distributions for monotonic and non-monotonic neurons differed between breeding conditions. Breeding condition had no effect on the distribution of characteristic frequencies in monotonic or non-monotonic neurons. We also quantified tuning precision by calculating frequency bandwidths 10 dB above the best threshold of each neuron. No effect of breeding condition was observed on frequency bandwidths for monotonic or non-monotonic cells. Detailed results and accompanying statistics can be found in Table 2.
Breeding condition increases spontaneous and maximum firing rates in monotonic neurons
Previous work has suggested that E2 increases neuronal responsiveness in NCM, a secondary region of the songbird auditory forebrain that expresses estrogen receptors (ERs) (Tremere et al., 2009; Maney and Pinaud, 2011; Tremere and Pinaud, 2011) (Fig. 1B). To determine whether E2 has similar effects on neurons in field L, a region that does not express steroid receptors, we calculated average spontaneous and maximum evoked firing rates from cells in birds under different breeding conditions (Fig. 4). E2 treatment significantly increased spontaneous firing rates of monotonic neurons. Similarly, monotonic cells showed a trend toward an increase in maximum evoked firing rates at CF. Spontaneous and maximum firing rates increased by the same relative amount, however, such that the firing rate range of these cells remained constant across breeding and nonbreeding conditions. Table 3 provides the statistical results of these comparisons.
Figure 4 also illustrates that E2 had different effects on non-monotonic neurons. While breeding condition did not have a significant effect on spontaneous firing rates of non-monotonic neurons, E2 treatment significantly decreased maximum evoked firing rates of these cells. This combination resulted in a significant decrease in the firing rate range of non-monotonic neurons. The associated statistics for these comparisons are listed in Table 3.
Breeding condition increases tone-evoked response strength and sensitivity of monotonic neurons
The effects of E2 on maximum evoked firing rates could be explained by an overall shift in evoked firing rates across stimulus levels, and/or a change in the shape of the rate level function, both of which could give rise to changes in auditory thresholds and dynamic ranges. To address this issue, we calculated RS level functions at CF for monotonic and non-monotonic neurons under different breeding conditions.
Figure 5A shows group RS data for monotonic neurons across stimulus level; accompanying statistical results are listed in Table 4. Input–output functions had similar shapes across experimental conditions, peaking at 80 dB SPL under breeding condition and 90 dB SPL under nonbreeding condition. The effect of sound intensity was significant. In addition, breeding condition significantly increased monotonic tone RS values across levels, by an average of 9.08 sp/s. The interaction between breeding condition and tone intensity on monotonic tone RS was also significant, such that the largest differences between the experimental groups occurred at midlevel intensities.
The effect of breeding condition on overall response magnitude resulted in differences in auditory sensitivity. Breeding condition significantly lowered CF thresholds compared with nonbreeding condition in monotonic cells (Fig. 5C). The E2-induced decrease in threshold contributed to a slight, but nonsignificant increase in monotonic neuron dynamic range (Fig. 5E). The statistics for these comparisons can be found in Table 3.
Figure 5B shows group RS data for non-monotonic neurons across stimulus level. As above, input–output functions for breeding and nonbreeding groups had similar shapes, peaking at 60 and 50 dB SPL, respectively, and the overall effect of sound level was significant. In contrast to the monotonic neurons, breeding condition significantly decreased tone RS values in non-monotonic neurons across stimulus levels by an average of 5.76 sp/s. The interaction term between level and breeding condition was not significant and no effect was found on CF threshold (Fig. 5D) or on dynamic ranges (Fig. 5F) in these cells. The results of these statistical analyses can be found in Tables 3 and 4.
To summarize the preceding results, breeding condition increased spontaneous firing rates, maximum evoked firing rates, tone-evoked response strengths, and pure-tone sensitivity in monotonic, but not non-monotonic neurons.
Song responses of field L neurons
Song responses can be excitatory or suppressive
Previous work has shown an effect of E2 treatment on selectivity and discrimination of conspecific song stimuli in secondary auditory forebrain regions (Maney et al., 2006; Tremere et al., 2009; Remage-Healey et al., 2010, 2012; Sanford et al., 2010; Tremere and Pinaud, 2011; Remage-Healey and Joshi, 2012). All of these studies presented song at a single intensity level, however. Therefore, before determining whether E2 affects field L song response properties, we first examined song-evoked rate level functions in individual cells. We observed that, while the majority (40 of 58) of responses to conspecific song were excitatory (Fig. 6A,C), increasing their rate as a function of song level, a substantial portion of them (18 of 58) were suppressive (Fig. 6B,D). Breeding condition did not influence the relative proportions of excitatory or suppressive song responses in field L (Table 5). To determine whether breeding condition affects song-evoked excitability, we calculated the maximum song-evoked |RS| for each cell; we found no effect of breeding condition (Table 5).
Breeding condition increases song-evoked response strength and dynamic range of cells with monotonic tone responses
We used the absolute value of response strength (|RS|) to analyze the change in neuronal firing rate for all song responses together. Song |RS| values increased as a function of song level in both breeding and nonbreeding groups (F(4,56) = 14.46; p < 0.001). E2 treatment, however, did not significantly affect rate level shapes or magnitudes (F(1,56) = 0.075; p = 0.785), and no interaction between breeding condition and song level was observed (F(4,56) = 0.313; p = 0.870). As noted in Table 5, breeding and nonbreeding groups also had similar song thresholds and dynamic ranges.
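The reason for pooling responses with |RS| can be seen in a small sketch. It assumes that RS is the stimulus-evoked firing rate minus the spontaneous firing rate, a definition that is not restated in this section. The cell labels and rates below are hypothetical.

```python
from statistics import mean


def response_strength(evoked_rate, spontaneous_rate):
    """Assumed RS: evoked minus spontaneous firing rate, in spikes/s."""
    return evoked_rate - spontaneous_rate


# Hypothetical cells: (song-evoked rate, spontaneous rate), in spikes/s
cells = {
    "excitatory_a": (18.0, 6.0),
    "excitatory_b": (12.0, 5.0),
    "suppressive_c": (2.0, 10.0),
}

rs = {name: response_strength(evoked, spont) for name, (evoked, spont) in cells.items()}

# Signed RS values partly cancel when excitatory and suppressive cells are averaged;
# absolute values keep the magnitude of the firing-rate change in either direction.
mean_signed_rs = mean(rs.values())
mean_abs_rs = mean(abs(value) for value in rs.values())
print(f"mean RS = {mean_signed_rs:.2f} sp/s; mean |RS| = {mean_abs_rs:.2f} sp/s")
```

With these illustrative values, the signed mean understates how strongly the population is driven, whereas the mean |RS| reflects the size of the change regardless of its sign.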
Thus, our results show that, when all neurons in our sample are considered, E2 treatment has no effect on song responses. Given that E2 treatment modulated tone responses in a selective manner, however, we analyzed song responses separately for different classes of neurons.
Tone and song thresholds were correlated within individual cells for both breeding (r = 0.60; n = 15; p = 0.019) and nonbreeding (r = 0.61; n = 19; p = 0.006) groups (Fig. 7). Song thresholds were higher than tone thresholds. This finding is not surprising, given that tone thresholds were measured at CF, the optimal tonal stimulus of the unit.
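The significance of a Pearson correlation follows from r and n through a t statistic with n − 2 degrees of freedom. The sketch below uses only the Python standard library and integrates the t density numerically; the function names are hypothetical, and a statistics package would normally be used instead. Because the published r values are rounded, the computed p values are only expected to approximate the reported ones.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    norm = math.exp(log_norm) / math.sqrt(df * math.pi)
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * central_area)


def pearson_r_test(r, n):
    """Return (t, two-tailed p) for a Pearson correlation r from n paired samples."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t, two_tailed_p(t, df)


t_breeding, p_breeding = pearson_r_test(0.60, 15)
t_nonbreeding, p_nonbreeding = pearson_r_test(0.61, 19)
print(f"breeding: t = {t_breeding:.2f}, p = {p_breeding:.3f}")
print(f"nonbreeding: t = {t_nonbreeding:.2f}, p = {p_nonbreeding:.3f}")
```

Both p values come out below 0.05 and close to the reported values of 0.019 and 0.006.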
The correlation between song and tone thresholds led us to predict that E2 treatment enhances song responses, but only in neurons with monotonic input–output functions in response to pure-tone stimuli. To test this prediction, we examined song-evoked |RS| level functions separately for cells that had monotonic and non-monotonic tone input–output functions. For cells that had monotonic tone responses, there was a significant effect of sound intensity on average song-evoked |RS| values under both breeding and nonbreeding conditions (Fig. 8A). E2 treatment significantly increased song-evoked |RS| values in these cells by an average of 2.578 sp/s across levels. Importantly, there was a significant interaction between song intensity and breeding condition; while breeding condition had a small impact even at the lowest intensity tested, this effect became more pronounced as song intensity increased. The greatest difference between conditions was observed at 90 dB SPL. Because the greatest shift in the input–output function occurred at higher stimulus levels, there was no significant change in song threshold (Fig. 8C). In addition, there was a trend for breeding condition to increase song dynamic range in cells with monotonic tone responses (Fig. 8E), but this trend failed to achieve statistical significance. The results of these statistical analyses can be found in Tables 3 and 4.
Average song-evoked |RS| values are plotted as a function of sound level for cells that had non-monotonic tone responses in Figure 8B. In these cells, the effect of level was not significant. E2 treatment did not significantly alter |RS| values across sound intensity, nor was there a significant interaction between song intensity and breeding condition (Table 4). Finally, breeding condition did not have a significant effect on song thresholds (Fig. 8D) or song dynamic ranges (Fig. 8F) in cells that had non-monotonic tone responses (see Table 3 for associated statistics).
In summary, breeding condition increased song-evoked response strengths and dynamic ranges in neurons with monotonic tone responses, but not neurons with non-monotonic tone responses.
Plasma E2 concentrations predict firing rates and response strengths
The observations that breeding condition influenced auditory response properties in a select subset of field L neurons (Figs. 4, 5, 8) led us to ask whether plasma E2 concentrations in individual birds correlate with single-unit firing rates or response strengths. To address this question, we compared the response properties of neurons from individual animals with the circulating level of plasma E2. As shown in Figure 9, plasma E2 concentrations were positively and significantly correlated with spontaneous firing rates (r = 0.71; n = 18; p < 0.001) and maximum evoked firing rates (r = 0.66; n = 18; p = 0.003) of monotonic neurons (Fig. 9A). Plasma E2 concentrations did not correlate with either spontaneous or evoked firing rates in non-monotonic cells (Fig. 9B). Similarly, while systemic E2 levels positively predicted both tone-evoked (Fig. 9C) and song-evoked (Fig. 9D) response strengths in cells with monotonic rate level functions to pure tones, there was no correlation between E2 and response strengths in cells with non-monotonic tone rate level functions (Fig. 9E,F). The response strengths shown in Figure 9C–F were all elicited at 50 dB SPL; we observed similar results at all other sound levels tested (data not shown).
Thus, spontaneous firing rates, maximum firing rates, and sound-evoked response strengths of monotonic, but not non-monotonic neurons, are all modulated by plasma E2 in a dose-dependent manner.
Discussion
Hormonal regulation of auditory processing in the CNS
The influence of sex steroid hormones on central auditory processing has received considerable attention, particularly for its clinical relevance. The latency of auditory brainstem responses (ABRs) changes across the menstrual cycle and after hormone replacement therapy in adult women (Al-Mana et al., 2008, 2010). In addition, sound localization is impaired in women with Turner's syndrome, a chromosomal abnormality that results in estrogen deficiency (Hederstierna et al., 2009). Recent work in both humans and rodents has demonstrated that ERs are expressed widely in the mammalian auditory system, including auditory cortex (Stenberg et al., 2001; Charitidi et al., 2010; Tremere et al., 2011). Whether plasma hormones affect the response properties of single neurons in the mammalian auditory cortex, however, is currently unknown. One group has reported that mouse cortical multiunit responses to pup isolation calls differ between mothers and virgins, but the relative contributions of hormonal state and pup care experience cannot be separated in these experiments, and these two variables may interact (Miranda and Liu, 2009). In the current study, we demonstrate that single-unit auditory function in the telencephalon of an avian species is modulated by circulating reproductive hormones in a dose-dependent manner. Together, these findings highlight the need for detailed neurophysiological investigations of the mammalian auditory cortex under carefully controlled hormonal conditions.
The majority of work investigating hormonal modulation of central auditory function focuses on the rapid action of brain-derived E2 to increase neuronal responsiveness in the songbird nucleus NCM (Pinaud and Tremere, 2012). NCM is a secondary nucleus downstream of field L and is specialized for conspecific song processing (Mello et al., 2004). In zebra finches, direct infusion of E2 into NCM increases single-unit evoked firing rates both locally in NCM and downstream in HVC (Tremere et al., 2009; Remage-Healey et al., 2010; Tremere and Pinaud, 2011; Remage-Healey and Joshi, 2012). Here, we report that E2 increases neuronal responsiveness in the primary auditory forebrain, indicating that the central effects of sex steroids are not limited to higher processing regions, but extend more generally within the auditory pathway. Surprisingly, the influence of sex steroids on auditory thresholds has never been assessed at a single-unit or multiunit level in the telencephalon. We show that monotonic field L cells have lower pure-tone thresholds and expanded song dynamic ranges under breeding condition. These results indicate that hormones do not simply modulate specialized forebrain processing tasks, such as neural song selectivity or discrimination. Instead, E2 also modulates fundamental aspects of auditory forebrain function across a wide range of stimulus intensities.
Cellular basis of E2 modulation of field L neurons
In NCM, blockade of ERs decreases neuronal activity (Tremere et al., 2009; Tremere and Pinaud, 2011), suggesting that E2 influences neuronal responses by binding directly to ERs. Field L does not express classical ERs (Jeong et al., 2011; Maney and Pinaud, 2011), and expresses little to no GPR30 (a nonclassical ER) in adulthood (Acharya and Veney, 2012), but demonstrates a clear sensitivity to E2. Here, we discuss multiple possibilities for the cellular basis underlying estrogenic modulation of field L neurons.
One possibility is that E2 directly modulates activity in an area upstream of field L that contains ERs. In songbirds, ERs are absent in the auditory thalamus and midbrain (Gahr et al., 1993; Gahr, 2001). While no systematic study has examined ER expression in the songbird auditory brainstem, ERα is expressed in three chicken brainstem nuclei: magnocellularis, angularis, and laminaris (Y. Wang and E. W. Rubel, unpublished observations). Additionally, ERα is expressed in hair cells and support cells of the zebra finch inner ear (Noirot et al., 2009), and in the cochlear ganglion of Gambel's white-crowned sparrows (Y. Wang, E. A. Brenowitz, and E. W. Rubel, unpublished observations). All of these auditory regions are possible candidates for direct estrogenic influence.
Similarly, field L activity may be modulated by descending input from efferent regions that express ERs. Field L's only known source of top–down input is from the caudolateral mesopallium (Vates et al., 1996; Reiner et al., 2004) (Fig. 1B), a secondary auditory region that lacks ER expression (Gahr, 1990, 2001; Gahr et al., 1993; Metzdorf et al., 1999). E2 modulation could be initiated instead in brain regions that are indirectly connected with field L. NCM, which expresses ERs (Bernard et al., 1999; Gahr, 2001; Saldanha and Coomaralingam, 2005; Jeong et al., 2011; Maney and Pinaud, 2011), is reciprocally connected to field L via three synapses, which pass through the medial and lateral portions of the caudal mesopallium (Fig. 1B). Additionally, the cup of the robust nucleus of the arcopallium is auditory responsive (Mello and Clayton, 1994) and may express ERs (Gahr et al., 1993). The cup sends projections to the shell of nucleus ovoidalis (Mello et al., 1998), which provides input into field L (Bonke et al., 1979; Vates et al., 1996). None of these pathways can be ruled out at this time.
Another possibility is that E2 modulates field L activity via monoaminergic signaling. The songbird auditory system receives catecholaminergic innervation and these inputs are sensitive to hormonal state (Maney and Pinaud, 2011). For example, sex steroids regulate catecholamine turnover in field L (Barclay and Harding, 1988, 1990). In female white-throated sparrows (a congener of the white-crowned sparrow), systemic E2 increases the number of catecholaminergic cells in locus ceruleus (LeBlanc et al., 2007) and increases the density of monoaminergic fibers in the auditory midbrain and forebrain (Matragrano et al., 2011, 2012). Furthermore, monoamines modulate songbird auditory forebrain physiology (Dave et al., 1998; Shea and Margoliash, 2003; Cardin and Schmidt, 2004) and behavioral responses to song playback (Appeltants et al., 2002; Riters and Pawlisch, 2007; Vyas et al., 2009; Pawlisch et al., 2011). Future experiments should test whether intact monoamine signaling is necessary to mediate the effects of systemic E2 on field L neurons.
Dose-responsive effects of E2 on central sensory physiology
Few studies have addressed whether the effects of circulating E2 on central sensory physiology scale with hormone concentration in a graded manner or are exerted in an all-or-none fashion once hormone levels reach some critical level (Oshima and Gorbman, 1969). Here, we demonstrate that in monotonic neurons, firing rates and response strengths gradually increase with increasing plasma E2. These findings suggest that an individual's auditory responsiveness is maximal when plasma E2 is highest (during courtship and copulation) (Wingfield and Farner, 1978) and is lower at other times during the breeding season, when intersex communication may be less important. Given the metabolic cost of increased neural activity (Niven and Laughlin, 2008), a graded hormonal effect may serve to reduce unnecessary energy expenditure postmating, when other behaviors associated with the breeding season (e.g., feeding young, molting) occur.
Cell-specific effect of E2
We found that E2 has robust effects on monotonic auditory function while leaving non-monotonic cell processing unchanged. While the precise roles of monotonic and non-monotonic cells in auditory coding are still a matter of speculation, several hypotheses have been proposed. One of these hypotheses, the level tolerance model, suggests that non-monotonic neurons maintain sound source identity over a wide range of intensities, allowing the frequency content of a complex stimulus to be encoded by neuronal firing rates without the confounding influence of stimulus level (Sadagopan and Wang, 2008). If this model is true, then our findings suggest that E2 may enhance sound responses in monotonic neurons to allow better signal detection in the breeding season, while the stability of non-monotonic cells might ensure that signal identity remains constant under variable listening conditions. Maintaining a consistent representation of sounds across seasons could be important for the accurate recognition of species or individuals within a flock.
Disparate effects of E2 within auditory pathway
In a previous study, systemic E2 treatment elevated ABR thresholds in female white-crowned sparrows (Caras et al., 2010). This result seems to contradict the findings presented here. To reconcile this apparent discrepancy, it is worth noting a few important differences between the studies. First, the ABR is a population response, generated by electrical activity in the auditory nerve and brainstem (Hall, 2007). Because no particular pure-tone frequency will elicit an optimal response in all neurons contributing to the ABR, the threshold is actually a measurement of sensitivity to a suboptimal stimulus. Conversely, in the current study, we calculated threshold at the optimal stimulus of an individual neuron: its CF. Furthermore, because the ABR is a pooled response recorded far-field (Jewett et al., 1970; Jewett and Williston, 1971), it is better described as a measure of neural synchrony, as opposed to firing rate. It is therefore difficult to compare the two measurements directly. Regardless of the methodological differences between the two studies, the possibility still remains that sex steroids have heterogeneous effects on separate portions of the ascending auditory system. This divergence could be explained by differences in hormone receptor expression patterns or mechanisms of hormonal action, as discussed above.
Footnotes
This work was supported by National Institutes of Health–National Institute on Deafness and Other Communication Disorders Grants F31DC010938, R01DC003829, and P30DC004661, the Seattle Chapter of Achievement Rewards for College Scientists Foundation, and the Washington Research Foundation. We thank Brandon Warren, David Schneider, and Sarah M. N. Woolley for technical assistance, Christine Portfors for her advice, and members of the Brenowitz and Rubel Laboratories for constructive discussion and support.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Edwin W. Rubel, Virginia Merrill Bloedel Hearing Research Center, University of Washington, Mail Stop 357923, Seattle, WA 98195. rubel@uw.edu