The Neural Substrate for Binaural Masking Level Differences in the Auditory Cortex

The binaural masking level difference (BMLD) is a phenomenon whereby a signal that is identical at each ear (S0), masked by a noise that is identical at each ear (N0), can be made 12–15 dB more detectable by inverting the waveform of either the tone or noise at one ear (Sπ, Nπ). Single-cell responses to BMLD stimuli were measured in the primary auditory cortex of urethane-anesthetized guinea pigs. Firing rate was measured as a function of signal level of a 500 Hz pure tone masked by low-passed white noise. Responses were similar to those reported in the inferior colliculus. At low signal levels, the response was dominated by the masker. At higher signal levels, firing rate either increased or decreased. Detection thresholds for each neuron were determined using signal detection theory. Few neurons yielded measurable detection thresholds for all stimulus conditions, with a wide range in thresholds. However, across the entire population, the lowest thresholds were consistent with human psychophysical BMLDs. As in the inferior colliculus, the shape of the firing-rate versus signal-level functions depended on the neurons' selectivity for interaural time difference. Our results suggest that, in cortex, BMLD signals are detected from increases or decreases in the firing rate, consistent with predictions of cross-correlation models of binaural processing and that the psychophysical detection threshold is based on the lowest neural thresholds across the population.


Introduction
An important sensory function is to detect signals in adverse conditions. In hearing, spatial separation of a signal from a masker can lead to a dramatic improvement in signal detectability. The binaural masking level difference (BMLD) is often taken as a measure of this ability: a tone signal that is identical at the two ears (S0), masked by a noise that is also identical at the two ears (N0), can be made 12-15 dB more detectable by sign-inverting either signal or noise at one ear (Sπ or Nπ) (Hirsh, 1948a, b; Licklider, 1948). The interaural inversion generates interaural time differences (ITDs) not unlike those that arise as a result of spatially separating the signal and noise.
Large BMLDs are only found at low frequencies (<1.5 kHz) and arise from the brainstem machinery sensitive to the minute differences in sound arrival time at the two ears that result from spatial separation (Jeffress, 1948; Colburn, 1977). Initial attempts to find the neural mechanisms for this unmasking in the brainstem and midbrain (inferior colliculus [IC]), by comparing firing rates to an N0 noise and S0 signal (N0S0) with those to the same N0 noise but an Sπ signal (N0Sπ), were largely unsuccessful (Langford, 1984; Caird et al., 1991).
Subsequently, studies in the IC, using signal detection methods, revealed neural responses consistent with the human psychophysically measured BMLDs (McAlpine et al., 1996; Jiang et al., 1997a, b; Palmer et al., 2000; Lane and Delgutte, 2005; Asadollahi et al., 2010). N0 and Nπ noises drive neurons according to their sensitivity to ITDs. In most neurons, N0 noises drive the neuron well, and S0 signals can be detected by increased firing rate, whereas Sπ signals produce a decrease in firing rate. In contrast, Nπ noise tends to drive neurons poorly, but the S0 tones are still detected by an increase in firing rate. Masked thresholds for populations of neurons responding to N0S0, N0Sπ, and NπS0 stimuli are consistent with human psychophysics (Hirsh, 1948a, b) in that (1) they show the lowest thresholds near the signal frequency; (2) N0S0 thresholds tend to be greater than N0Sπ and NπS0 thresholds; and (3) N0Sπ thresholds tend to be lower than NπS0 thresholds. It was concluded that different neuron populations contribute to signal detection in different binaural configurations (Jiang et al., 1997a).
Although the IC plays an important role in binaural processing, perception must ultimately depend upon the activity in cortex. Given compelling evidence for widespread convergence from the midbrain into the cortex and extensive feedback loops incorporating the medial geniculate body, encoding of BMLD stimuli in the cortex could be quite different. However, there are few studies of binaural processing between the IC and cortex. Here we apply similar methods as in IC to investigate neural responses to BMLD stimuli in auditory cortex.
As in the IC, neural responses to BMLD stimuli in auditory cortex were consistent with predictions from binaural cross-correlation models (for review, see Colburn and Durlach, 1978; Stern and Trahiotis, 1995; Colburn, 1996).

Materials and Methods
Guinea pigs from an in-house maintained breeding colony (18 male, 16 female weighing between 282 and 974 g) were used for the neurophysiological recordings. All experiments were performed under the terms and conditions of licenses issued by the United Kingdom Home Office under the Animals Scientific Procedures Act 1986, project license number 4003049, and the approval of the ethical review committee of the University of Nottingham.
Surgical procedures. The methods are similar to those used previously (see Jiang et al., 1997a, b). Urethane (0.9-1.3 g/kg in a 20% solution, i.p.) was used to induce anesthesia. Subsequent analgesia, as determined by suppression of the pedal withdrawal reflex, was maintained with intramuscular injections of 0.2 ml Hypnorm (fentanyl citrate 0.315 mg/ml, fluanisone 10 mg/ml, i.m.). Body temperature was maintained at 38°C using a rectal probe and a heating blanket (Harvard Apparatus Homeothermic Blanket Control Unit 50787). A premedication of 0.2 ml atropine (600 μl/ml, s.c.) was administered to reduce bronchial secretions. The trachea was cannulated to reduce dead space and allow the animals to be artificially respired with 100% oxygen throughout the experiment (Harvard Apparatus model 970 ventilator). End-tidal carbon dioxide levels and heart rate (via electrodes either side of the thorax) were monitored (Vetspecs VSM8).
Polythene tubes (0.5 mm inner diameter, 250-400 mm length) were inserted and sealed into the bullae to allow pressure equalization while maintaining closed-field sound presentation. An opening was made in the connective tissue above the foramen magnum to release pressure variations in the cerebrospinal fluid and so increase recording stability.
Clear access to the auditory meatus was achieved by removing part of the tragus. The animal was placed into a stereotaxic frame with earbars consisting of hollow perspex speculae so that the tympanic membranes were visible.
The skull overlying the auditory cortex was cleared, and a craniotomy ~6 mm in diameter was made on the right side to reveal the primary auditory cortex. The dura mater was removed and the exposed cortex covered with agar (1.5% agar in 0.9% normal saline) to avoid desiccation and to stabilize recordings. Recordings were taken from low-frequency neurons in the primary auditory cortex area A1. In guinea pigs, core area A1 is situated caudal and ventral to the pseudosylvian sulcus (Wallace et al., 2000) with low-frequency units found at the rostral end. Electrophysiological response properties were used to confirm the position of the recording site within the low-frequency area of A1.
Arrays of four glass-coated tungsten microelectrodes were used to record extracellular action potentials (Bullock et al., 1988). These arrays were advanced together using a piezoelectric motor (Burleigh Inchworm IW-700/710).
Stimulus presentation and neuronal recordings were achieved using BrainWare (version 9.19, Jan Schnupp, Oxford University) software and Tucker-Davis Technologies (TDT) hardware.
Neural recording. The action potentials of single neurons were isolated using 50 ms tone and/or noise bursts as the search stimuli. The raw signals were recorded and bandpass filtered (0.16-6000 Hz) using a high-impedance headstage (TDT RA16AC) and then digitized (TDT RA16PA). The digitized action potentials were further bandpass filtered (300-3000 Hz) and amplified using a digital signal processor (TDT RX7), controlled by the BrainWare software. Short portions of the units' action potentials were recorded whenever the amplitude crossed a predetermined threshold, set individually for each unit. These action potentials were analyzed with Plexon Offline Spike Sorter (version 2.8.8). Principal component analysis was used to cluster them into groups that originated from the same neuron. This method allows single-unit activity to be accurately and objectively separated from any multiunit activity.
Stimulus generation. All stimuli were generated with BrainWare using a TDT RX8 Digital Signal Processor, which contains a 24-bit sigma-delta digital-to-analog converter. The signal levels were controlled with programmable attenuators (TDT PA5). The maximum output of the sound system was limited to ~100 dB SPL. Stimuli were presented either binaurally or monaurally within sealed acoustic systems. All stimuli were gated on and off in software with 2 ms rise/fall time cosine-squared windows (TDT Cos2Gate). The speakers used were custom-modified Radioshack 40-1377 tweeters fitted into the hollow speculae via critically damped tubes (diameter 2.5 mm, length 24 mm: M. Ravicz, Eaton Peabody Laboratory).
The sound system was calibrated at the beginning of each experiment using a 1/2-inch Brüel and Kjaer 4134 condenser microphone connected to a calibrated 1 mm probe tube (Brüel and Kjaer DB 0241). The end of the probe tube was positioned within the speculum in close proximity to the tympanic membrane. The calibration sound was white noise, presented 20 times. The system response was calculated by Fourier transforming the microphone waveform and correcting for microphone sensitivity and probe-tube characteristics.
Recording paradigms. The 100 ms tone bursts covering 4 octaves were presented once in pseudo-random order to obtain the neurons' frequency response areas. The tones were presented identically to both ears and repeated once every 400 ms in 5 dB steps over a sound level range from 0 to 100 dB SPL. Spikes were counted within a 10-120 ms window following the tone onset. The frequency response area is the 2D pattern of spike counts as a function of frequency and level. Best frequencies were derived offline using the automated method described previously (Palmer et al., 2013; Sumner and Palmer, 2012) with visual confirmation.
Rate-level functions were measured to 100 ms broadband (50-5000 Hz) noise bursts. Like the tones, the noise bursts were presented once every 400 ms and spikes were counted within a 10-120 ms window after burst onset. The levels covered the range from 0 to 100 dB SPL in 5 dB steps. At each level, the stimuli were presented either binaurally or monaurally to left or right ear, yielding three noise rate-level functions for each neuron. The three rate-level functions were obtained simultaneously with all levels for each function presented once in a pseudo-random order before repeating in a different order (100 repeats).
Sensitivity to ITDs was assessed by measuring firing rate as a function of ITD for broadband (50-5000 Hz) noise and a 500 Hz tone. The noise was presented at a spectrum level of 23 dB SPL (60 dB SPL overall), and the tone was presented at 75 dB SPL. The ITDs ranged from −2000 μs to 2000 μs in 40 equal steps of 100 μs. The stimuli were 100 ms in duration and repeated once every 400 ms. Spikes were counted within a 10-120 ms window after stimulus onset. The noise and tone rate-ITD functions were measured separately. For each, all ITDs were presented once in a pseudorandom order before repeating (20 repeats noise, 50 repeats tone).
Masked rate-level functions (MRLFs) were measured to S0 and Sπ signals in the presence of either N0 or Nπ masking noise. The signal was a 100 ms 500 Hz pure tone, and the noise was 100 ms broadband white noise (filtered: 50-5000 Hz) with a 23 dB SPL spectrum level. The signal and noise were gated on and off simultaneously and presented once per 400 ms. Spikes were counted within a window 10-120 ms after stimulus onset. The signal level was varied pseudo-randomly in 5 dB steps over a range from 0 to 100 dB SPL. The different interaural configurations of the signal and noise (N0S0, N0Sπ, NπSπ, and NπS0) were presented in a pseudo-random order. MRLFs were constructed by plotting the total spike count to the signal and noise as a function of the signal level (50 repeats). Poststimulus time histograms (PSTHs) were formed with 1 ms bins by summing over all tone levels and phases separately for N0 and Nπ maskers. Rasters were also plotted across tone level for each of the different interaural configurations of the signal and noise (N0S0, N0Sπ, NπSπ, and NπS0). The PSTHs and raster plots were used to visually determine whether the spike rate exceeded the spontaneous rate at the onset (On), sustained (Sus), or offset (Off) of the response and to classify the unit as On, OnOff, OnSus, OnSusOff, Sus, or Off. The variance of spike counts per stimulus was plotted as a function of spike count per stimulus on logarithmic axes and the slope calculated. This gives a measure of how closely the spike generation accords with a Poisson process (variance-mean ratio of 1).
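The four interaural configurations can be illustrated with a minimal stimulus sketch. It is not the BrainWare/TDT implementation: the sample rate, the choice of the right ear for the inversion, the unfiltered Gaussian noise (the 50-5000 Hz band-limiting is omitted), and the names `cos2_gate`, `make_stimulus`, and `CONFIGS` are all assumptions made for illustration.

```python
import math
import random

FS = 48000        # sample rate in Hz (assumed; not stated in the text)
DUR = 0.100       # stimulus duration in s
RAMP = 0.002      # cosine-squared rise/fall time in s
F_TONE = 500.0    # signal frequency in Hz

# (noise sign, tone sign) applied at the right ear; the left ear is the reference.
CONFIGS = {"N0S0": (1, 1), "N0Spi": (1, -1), "NpiSpi": (-1, -1), "NpiS0": (-1, 1)}


def cos2_gate(n, fs=FS, ramp=RAMP):
    """Envelope of n samples with cosine-squared onset and offset ramps."""
    n_ramp = int(round(ramp * fs))
    env = [1.0] * n
    for i in range(n_ramp):
        g = math.sin(0.5 * math.pi * i / n_ramp) ** 2
        env[i] = g
        env[n - 1 - i] = g
    return env


def make_stimulus(config, tone_amp, noise_sd=1.0, seed=0):
    """Return (left, right) waveforms for one interaural configuration.

    Signal and noise share one gate, so they start and stop together.
    """
    noise_sign, tone_sign = CONFIGS[config]
    rng = random.Random(seed)
    n = int(round(DUR * FS))
    env = cos2_gate(n)
    tone = [tone_amp * math.sin(2 * math.pi * F_TONE * i / FS) for i in range(n)]
    noise = [rng.gauss(0.0, noise_sd) for _ in range(n)]  # hypothetical unfiltered masker
    left = [e * (nz + t) for e, nz, t in zip(env, noise, tone)]
    right = [e * (noise_sign * nz + tone_sign * t) for e, nz, t in zip(env, noise, tone)]
    return left, right
```

Calling `make_stimulus("N0Spi", tone_amp=0.5)` gives identical noise at the two ears with the tone inverted at the right ear; varying `tone_amp` corresponds to stepping the signal level while the masker stays fixed.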
Data analysis. The MRLFs were used to determine masked thresholds for the tone using a technique based on signal detection theory (Green and Swets, 1966). Here, an adapted version of the detectability index (d′) is used, known as the Standard Separation (D) (Sakitt, 1973). This calculation has been previously used to determine masked thresholds for binaural masking conditions in the guinea pig IC and has the advantage of not making any assumptions regarding the underlying distributions (Jiang et al., 1997a, b). At each signal level, l, D is calculated as follows:

$$D(l) = \frac{r_{N+S}(l) - r_{N}}{\sqrt{sd_{N+S}(l)\, sd_{N}}}$$

Here, r_{N+S}(l) is the mean spike rate in response to the noise-masked signal at level l, sd_{N+S}(l) is the standard deviation (SD) of the spike rates across different repeats, r_N is the mean spike rate across the lowest five signal levels, and sd_N is the corresponding SD (calculated across the data for the five lowest levels pooled together).
In previous studies, the masked threshold has been defined as the lowest signal level at which the Standard Separation (D) reached an absolute value of 1.0. This value approximately equates to 75% correct in a two-alternative-forced-choice psychophysical task. However, in the present study, many MRLFs did not quite reach this criterion value. Consequently, here we used the signal level where D reached a value of 0.75 as masked threshold, which corresponds to 70% correct performance in a two-interval forced-choice experiment (Macmillan and Creelman, 2005, their Table A5.7).
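The threshold procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: it takes D as the signed difference in mean spike count divided by the geometric mean of the two SDs (the Sakitt form), uses the sample SD, and the function names are hypothetical.

```python
import math
import statistics


def standard_separation(counts_by_level, n_noise_levels=5):
    """Signed D at each signal level.

    counts_by_level: per-repeat spike counts, one list per signal level,
    ordered from the lowest level upward. The lowest n_noise_levels levels
    are pooled and treated as the noise-alone response.
    """
    pooled = [c for level in counts_by_level[:n_noise_levels] for c in level]
    r_n = statistics.mean(pooled)
    sd_n = statistics.stdev(pooled)          # sample SD (an assumption)
    d_values = []
    for counts in counts_by_level:
        r_ns = statistics.mean(counts)
        sd_ns = statistics.stdev(counts)
        denom = math.sqrt(sd_ns * sd_n)      # geometric mean of the two SDs
        d_values.append((r_ns - r_n) / denom if denom > 0 else float("nan"))
    return d_values


def masked_threshold(levels, d_values, criterion=0.75):
    """Lowest level at which |D| reaches the criterion, with its type.

    Returns (level, "P") for an increase in rate, (level, "N") for a
    decrease, or None if the criterion is never reached.
    """
    for level, d in zip(levels, d_values):
        if abs(d) >= criterion:              # NaN never satisfies this test
            return level, ("P" if d > 0 else "N")
    return None
```

Scanning from the lowest level upward means that a nonmonotonic function which crosses the criterion more than once yields the lowest crossing.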
Tone rate-ITD functions were analyzed using the method of Goldberg and Brown (1969) to obtain best ITD, vector strength, and Rayleigh statistic. Best ITDs are only reported when the Rayleigh test of uniformity (Buunen and Rhode, 1978) was significant at the 0.05 level (2nR² > 5.991, where n is the total number of spikes and R is the vector strength).
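A minimal sketch of this vector-strength calculation follows, assuming each ITD is converted to a phase of the 500 Hz tone period and weighted by the spike count recorded at that ITD. The function name and the returned dictionary keys are hypothetical.

```python
import math


def tone_itd_stats(itds_us, spike_counts, tone_freq_hz=500.0):
    """Vector strength, best ITD, and Rayleigh statistic from a rate-ITD function.

    itds_us: ITD values in microseconds.
    spike_counts: total spikes recorded at each ITD.
    """
    period_us = 1e6 / tone_freq_hz
    n = sum(spike_counts)
    if n == 0:
        return None
    # Sum unit vectors at each ITD's phase, weighted by spike count.
    x = sum(c * math.cos(2 * math.pi * itd / period_us)
            for itd, c in zip(itds_us, spike_counts))
    y = sum(c * math.sin(2 * math.pi * itd / period_us)
            for itd, c in zip(itds_us, spike_counts))
    r = math.hypot(x, y) / n                      # vector strength R
    best_phase = math.atan2(y, x)                 # radians, within (-pi, pi]
    rayleigh = 2 * n * r * r                      # statistic 2nR^2
    return {
        "vector_strength": r,
        "best_itd_us": best_phase / (2 * math.pi) * period_us,
        "rayleigh": rayleigh,
        "significant": rayleigh > 5.991,          # 0.05 level, as in the text
    }
```

Because the best ITD is derived from a phase, it is only defined within plus or minus half a period of the tone (±1000 μs at 500 Hz).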
Noise rate-ITD functions were interpolated to 10 μs resolution, smoothed with a 300 μs half-width triangular window, and the ITD of the largest peak of the smoothed function chosen as the best ITD. The significance of the peak was tested using bootstrapping. To do this, the responses for every ITD were first randomly redistributed within each repeat (in the experiment, all ITDs were presented before repeating). A simulated noise rate-ITD function was then formed and smoothed as above and the magnitude of the largest peak stored. This was repeated 2000 times to give a distribution of peak heights that would be obtained under the null hypothesis of no tuning to ITD. If the magnitude of the peak obtained from the original data was >95% of randomly obtained peaks, then the null hypothesis was rejected and the noise ITD accepted as significant at the 0.05 level.
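The shuffling test described above can be sketched as follows. Several details are assumptions rather than the authors' stated choices: linear interpolation, a triangular window whose weights fall to zero at ±300 μs, renormalization of the window at the edges of the ITD range, and the maximum of the smoothed function as the peak magnitude. All function names are hypothetical.

```python
import random


def smooth_rate_itd(rates, step_us=100, fine_us=10, half_width_us=300):
    """Interpolate linearly to fine_us resolution, then apply a triangular window."""
    k = step_us // fine_us
    fine = []
    for a, b in zip(rates[:-1], rates[1:]):
        fine.extend(a + (b - a) * j / k for j in range(k))
    fine.append(rates[-1])
    h = half_width_us // fine_us
    offsets = range(-h + 1, h)                      # weights are zero at +/-h
    weights = [1.0 - abs(j) / h for j in offsets]
    smoothed = []
    for i in range(len(fine)):
        num = den = 0.0
        for j, w in zip(offsets, weights):
            if 0 <= i + j < len(fine):              # truncate window at the edges
                num += w * fine[i + j]
                den += w
        smoothed.append(num / den)
    return smoothed


def peak_height(counts):
    """counts[r][k] is the spike count on repeat r at ITD index k."""
    n_rep = len(counts)
    mean_rate = [sum(col) / n_rep for col in zip(*counts)]
    return max(smooth_rate_itd(mean_rate))


def itd_peak_p_value(counts, n_boot=2000, seed=0):
    """Fraction of shuffled data sets whose peak is at least as high as observed."""
    rng = random.Random(seed)
    observed = peak_height(counts)
    exceed = 0
    for _ in range(n_boot):
        # Redistribute responses across ITDs independently within each repeat.
        shuffled = [rng.sample(row, len(row)) for row in counts]
        if peak_height(shuffled) >= observed:
            exceed += 1
    return exceed / n_boot
```

A returned value below 0.05 corresponds to the criterion in the text. Shuffling preserves each repeat's total spike count, so comparing raw peak heights is equivalent to comparing peak height above the mean rate.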
The distribution of noise best ITDs was compared with that obtained from the IC (McAlpine et al., 2001). Because the IC distribution was shown to be strongly BF-dependent and the distributions of BFs sampled in the IC and cortex were different, the IC data were resampled to have the same BF distribution as obtained in cortex. The data were divided into 100-Hz-wide bands and as many samples taken from the IC as occurred in each BF band in the cortex. Sampling was done with replacement. A best ITD histogram was then formed. This procedure was repeated 1000 times to give an average best ITD histogram for IC with the same BF distribution as in cortex. To test whether the best ITD distribution in cortex was different from IC, a χ² goodness-of-fit test was performed. All bins < −400 μs were pooled, as were all bins >1000 μs, to ensure each bin had a nonzero count. The resampled IC data were treated as the expected distribution, and the cortex data as the measured distribution.
Binaural model. One of the most established models of binaural processing is the cross-correlation model (Stern and Trahiotis, 1995; for review, see Colburn, 1996). To compare the predictions of such a model with our physiological data, we implemented a version using a binaural toolbox (M. A. Akeroyd, personal communication). For each ear simulation, signals were first passed through matched gammatone filterbanks before applying a neural transduction process based upon the hair cell model of Meddis et al. (1990) with a high spontaneous rate. Filter outputs from the two ears were then delayed relative to one another to simulate neuron best ITDs. We assumed that all best ITDs are present in all frequency channels. The delayed signals were cross-multiplied, the cross-product integrated over the whole stimulus, and the resulting activity plotted as a function of frequency and best ITD (yielding a type of cross-correlogram, but without normalization; Fig. 12A-C, F-H). The cross-products were averaged across frequencies in the range 300-800 Hz, yielding a summary correlogram showing activity as a function of interaural delay (Fig. 12D, E). The frequency-averaged binaural cross-products (Fig. 12D, E) were weighted by the distribution of best ITDs observed in the measured data (Fig. 13).
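The BF-matched resampling and the goodness-of-fit statistic can be sketched as below. The histogram bin edges are left to the caller because the text does not state the bin width; skipping cortical BF bands that have no IC units is an assumption, and the function names are hypothetical.

```python
import bisect
import random
from collections import Counter, defaultdict


def bf_band(bf_hz, width_hz=100):
    """Index of the 100-Hz-wide band containing a best frequency."""
    return int(bf_hz // width_hz)


def matched_ic_histogram(ic_units, cortex_bfs, bin_edges_us, n_iter=1000, seed=0):
    """Average best-ITD histogram of IC units resampled to the cortical BF distribution.

    ic_units: (bf_hz, best_itd_us) pairs from the IC.
    cortex_bfs: best frequencies of the cortical units.
    bin_edges_us: ascending histogram edges; ITDs outside them are dropped.
    """
    rng = random.Random(seed)
    ic_by_band = defaultdict(list)
    for bf, itd in ic_units:
        ic_by_band[bf_band(bf)].append(itd)
    wanted = Counter(bf_band(bf) for bf in cortex_bfs)
    n_bins = len(bin_edges_us) - 1
    total = [0.0] * n_bins
    for _ in range(n_iter):
        for band, n in wanted.items():
            pool = ic_by_band.get(band)
            if not pool:
                continue                        # no IC units in this band (assumed skip)
            for itd in rng.choices(pool, k=n):  # sampling with replacement
                idx = bisect.bisect_right(bin_edges_us, itd) - 1
                if 0 <= idx < n_bins:
                    total[idx] += 1
    return [t / n_iter for t in total]


def chi_square(observed, expected):
    """Pearson chi-square statistic; bins with zero expected count are ignored."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

Here the resampled IC histogram would be passed as `expected` and the cortical histogram as `observed`, after pooling the outer bins as described in the text.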
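The delay-and-cross-multiply stage alone can be illustrated with a much-reduced sketch. It is not the binaural toolbox used in the study: the gammatone filterbank and the hair-cell transduction stage are omitted, raw waveforms are multiplied directly, and the sample rate and function names are assumptions.

```python
import math
import random

FS = 20000   # sample rate in Hz (assumed), i.e. 50 microseconds per sample


def cross_product(left, right, delay_samples):
    """Unnormalized cross-product of left with right shifted by delay_samples."""
    n = len(left)
    lo = max(0, -delay_samples)
    hi = min(n, n - delay_samples)
    return sum(left[i] * right[i + delay_samples] for i in range(lo, hi))


def correlogram(left, right, max_delay_us=2000, step_us=100):
    """Cross-product as a function of interaural delay (microseconds)."""
    us_per_sample = 1e6 / FS
    return {
        delay_us: cross_product(left, right, int(round(delay_us / us_per_sample)))
        for delay_us in range(-max_delay_us, max_delay_us + 1, step_us)
    }


def demo(tone_amp=1.0, dur_s=0.1, seed=1):
    """Zero-delay activity for noise alone, N0S0, and N0Spi."""
    rng = random.Random(seed)
    n = int(round(dur_s * FS))
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    tone = [tone_amp * math.sin(2 * math.pi * 500.0 * i / FS) for i in range(n)]
    plus = [a + b for a, b in zip(noise, tone)]    # noise + tone
    minus = [a - b for a, b in zip(noise, tone)]   # noise + inverted tone
    return {
        "N0": correlogram(noise, noise)[0],
        "N0S0": correlogram(plus, plus)[0],
        "N0Spi": correlogram(plus, minus)[0],
    }
```

At zero delay the N0Sπ product reduces to the noise power minus the tone power, whereas the N0S0 product adds the tone power, so in this reduced form a channel with a best ITD of 0 shows a decrease for Sπ and an increase for S0.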

Results
Data were obtained from 165 single units in the primary auditory cortex of 34 guinea pigs. Responses were measured to a variety of stimuli to characterize the monaural and binaural responses of each unit and thereby provide explanatory leverage for the BMLD responses. An example of the full range of measurements for each unit is shown in Figure 1.
The frequency response area for this unit is shown in Figure 1A. This unit's best frequency was measured as 566 Hz. The signal for the MRLFs was a 500 Hz tone, and its position with respect to the response area is shown in Figure 1A (vertical line); the level of the masking noise is shown by the horizontal line.
Each unit's response to the noise alone, presented either monaurally to the left or right or binaurally, was measured as a function of noise level (Fig. 1B). These rate-level functions show that this unit is binaurally sensitive: it responds more strongly to noise presented to both ears than to either ear alone, even though the firing threshold is the same for noise presented monaurally or binaurally. The activity of the unit was modulated by the ITD of the signals: the ITD functions for the 500 Hz tone (Fig. 1C) and the broadband noise (Fig. 1D) both show a clear peak near 0 ITD. The "best ITD" (where the unit responded maximally) was calculated by finding the peak in each ITD function separately (after smoothing). For this unit, the tone and noise ITD functions have best ITDs of 153 and 100 μs, respectively.
MRLFs were measured in response to the 500 Hz tone signal against the noise masker in four interaural configurations: N0S0 and N0Sπ (Fig. 1E) and NπSπ and NπS0 (Fig. 1F). Increasing signal level caused either an increase or decrease in firing rate depending on the stimulus interaural configuration and the best ITD of the unit (for more detail, see later). The rasters for these conditions are shown in Figure 1I and the PSTH pooled across conditions with the same noise phase in Figure 1H. In all conditions, this unit fires just after the onset of the signal, and rarely elsewhere, and so was classified as On. The variance of spike counts was proportional to the spike count, with a slope of 0.91, so firing statistics were close to a Poisson process (Fig. 1G).

Figure 3. [...] 500 Hz pure tone. Black bars represent best ITDs over an ITD range of ±800 μs. A, Best ITD calculated as maxima of the smoothed ITD function produced from responses to broadband noise. Only data from the 92 "Peak" units are shown. For comparison, gray bars represent best noise ITDs from the IC, resampled to have the same best frequency distribution as cortical data. B, Best ITD from the "best phase" calculation of Goldberg and Brown (1969) in response to a pure tone signal.

Population characteristics
The best frequencies of all units included in this study were <1400 Hz (range 60-1345 Hz), with the majority having best frequencies ~600 Hz (Fig. 2, gray bars). The majority of single units recorded (64%: 106 of 165) gave a measurable masked threshold for at least one of the binaural conditions (Fig. 2, black bars). The majority of units (60%: 99 of 165) were also significantly sensitive to the noise ITD (Fig. 2, white bars). Because ITD sensitivity and masked thresholds were measured in different runs, a unit that gave a binaural masked threshold, and was therefore evidently phase sensitive, could still produce a nonsignificant noise ITD sensitivity function. There is no difference between the best frequency distributions of these groups.
The firing rate of the units was low, on average 1.0 (1.1 SD) spikes/stimulus in response to N0 noise and 0.62 (0.77 SD) spikes/stimulus in response to Nπ. The majority of units (98%: 161 of 165) responded at the stimulus onset, 38% (63 of 165) responded in a sustained manner during the stimulus, and 18% (30 of 165) responded with a peak at offset. These responses were frequently combined within units; the most common type, how- [...]

Population distributions of the noise and tone best ITDs are shown in Figure 3 (in black). In line with convention, positive ITDs represent the contralateral, and negative ITDs the ipsilateral, space. For both the noise and tone, the majority of units had best ITDs <350 μs within the contralateral hemifield. This is comparable with the range of ITDs calculated from head-related transfer functions (HRTFs) measured acoustically (Sterbing et al., 2003; Greene et al., 2014). Only 7 (7%) of the 99 delay-sensitive units were trough types (Yin and Kuwada, 1983) (i.e., an approximately constant response across ITD except for a dip at the "worst" phase). These were included in the analysis but are not plotted in Figure 3.
Also shown in Figure 3A (in gray) is the best ITD distribution for IC neurons previously obtained (McAlpine et al., 2001), resampled so that the BF distribution was the same in IC as cortex (see Data analysis). Each distribution has a peak on the contralateral side, with fewer units with a best ITD near 0. The two distributions look qualitatively different; however, the null hypothesis of no difference cannot be rejected (χ² goodness of fit, χ² = 40, df = 28).

Figure 4. [...] A, MRLFs and D functions for a unit with BF of 280 Hz and noise best ITD of −800 μs, giving thresholds for all four conditions with an increase in firing rate. B, MRLFs and D functions for a unit (BF: 800 Hz; noise best ITD: 200 μs), giving thresholds with an increase in firing rate for N0S0 and NπS0 and a decrease in firing rate for N0Sπ and NπSπ. C, MRLFs and D functions for a unit (BF: 1130 Hz; noise best ITD: −300 μs), giving thresholds for all four conditions with a decrease in firing rate. D, MRLFs and D functions for a unit (BF: 565 Hz; noise best ITD: −1100 μs), giving thresholds for N0S0 and N0Sπ with an increase in firing rate, NπS0 with a decrease in firing rate, and no threshold for NπSπ.

Figure 4 shows the MRLFs for all four stimulus configurations in each of four example neurons. As described in detail in Materials and Methods, the masked threshold is estimated by calculating the standard separation, D, based on the spike rate difference between the signal-plus-noise conditions and the noise-alone condition at each signal level. The signal level at which D reaches a criterion value of 0.75 (in either positive or negative direction; see Fig. 4, horizontal lines) was taken as the unit's threshold for the given condition. In some cases, the MRLFs were nonmonotonic (e.g., Fig. 4B, D) and therefore met the criterion more than once. When this occurred, the threshold was taken as the lowest signal level at which the criterion was met.

Masked thresholds
The criterion value was reached by either an increase or decrease in firing rate from the noise-alone response. Thresholds based on an increase in firing rate (positive D values) are referred to as P-type thresholds, whereas those derived from a decrease in firing rate (negative D values) are referred to as N-type thresholds. Within a single unit, both decreases and increases in firing rate could be observed for different interaural configurations. Of the 106 neurons that yielded at least one measurable masked threshold, only 7 yielded thresholds for all four binaural configurations. The remaining 99 units were almost equally divided between those giving thresholds for one, two, or three of the interaural configurations. Figure 4D shows an example of a unit for which thresholds could be measured for three of the four conditions. The D values for NπSπ (bottom right, black circle) did not reach the 0.75 criterion, so no threshold was recorded for this condition in this unit. All measurable thresholds were included in the population analysis, not just those where comparable homophasic and antiphasic thresholds were obtained in the same unit. Masked thresholds were measured from neurons with a variety of best frequencies (Fig. 5). Upward pointing triangles show thresholds for P-type units, and downward pointing triangles for N-type units. There is a wide range of thresholds in all conditions. Unsurprisingly, and in line with previous data, the lowest thresholds were observed in units with best frequencies close to 500 Hz (Jiang et al., 1997a, b). Figure 6 shows the number of P- and N-type thresholds for each condition, and the average masked thresholds for each group for units with BFs in the range 300-800 Hz. Substantially more measurable thresholds were obtained for the two antiphasic conditions than for their homophasic counterparts (n: N0S0 = 23, N0Sπ = 50, NπSπ = 6, NπS0 = 53).
There was also a noticeable difference in the distributions of P- and N-type thresholds for the two antiphasic conditions. The majority of N0Sπ thresholds were N-type, whereas the majority of NπS0 thresholds were P-type. P-type responses yielded a lower average antiphasic threshold than N-type responses; this was generally because the firing rate tended to increase when a signal was added to the noise but, in many conditions, did not reach threshold before the firing rate dropped nonmonotonically and reached the threshold criterion in a decreasing direction.
The wide range of thresholds raises the question of what factors are predictive of threshold. Although many units with BFs close to the signal frequency gave the lowest thresholds, others gave very high thresholds. PSTH response type was not predictive, and the masked thresholds were almost identical whether the response window was constrained to the onset only (10-60 ms) or the whole response (10-120 ms). Response windows constrained to the sustained response (70-120 ms) gave thresholds in 44 units; these tended to be a few dB higher than for the whole response. Response windows constrained to the offset response (120-170 ms) gave thresholds in 20 units; these were up to 20 dB higher. There were small effects of noise-only firing rate and the SD of noise-only firing rate (where the noise-only response was defined as the bottom 20 dB of the MRLF). Neither of these measures had an effect on the homophasic thresholds, but both were barely significant (t test, p < 0.05) in the antiphasic conditions. Here units that gave a measurable masked threshold tended to have a lower firing rate and SD than those that gave no measurable masked threshold. However, although just significant, the effect was barely noticeable and is not illustrated. Unsurprisingly, the maximum rate across the MRLF was significantly (t test, p < 0.01) related to the ability to measure a masked threshold. This simply reflects the fact that units with a large dynamic range in response to the masked stimulus were able to cross threshold.
The standard separation, D, is a combination of two factors, the SD and the difference in firing rate between noise-alone and noise-plus-signal conditions. The variation in one or both can determine threshold (Shackleton et al., 2003; Tollin et al., 2008). In a Poisson process, the variance is proportional to the firing rate, so they are not independent of each other. We measured the slope of the log-log plot of variance against spike count separately for each binaural masking condition (N0S0, N0Sπ, NπSπ, and NπS0) as firing rate varied with signal level. The distributions were independent of condition so are plotted pooled together in Figure 7. The distribution is centered on one, suggesting that most units had nearly Poisson firing statistics. Two distributions are plotted depending upon whether a masked threshold was obtained (white bars) or not (black bars). Although the difference is slight, the distribution is biased toward lower slopes when a threshold was obtained (t test, p < 0.05 for N0S0, N0Sπ, and NπS0). This means that variance did not increase as rapidly as firing rate as firing rate increased. In conclusion, firing rate and the SD of firing rate are proportional to each other in these masked signal conditions, and each factor is almost equally important; however, for some units, the SD is lower than predicted from Poisson statistics, and this favors lower thresholds.
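The variance-versus-count slope can be illustrated with a short least-squares sketch. It assumes an ordinary linear regression on base-10 logarithms and the sample variance, and it drops signal levels whose mean or variance is zero because their logarithm is undefined; the function name is hypothetical.

```python
import math
import statistics


def variance_mean_slope(counts_by_level):
    """Slope of log10(variance) against log10(mean spike count) across levels.

    counts_by_level: per-repeat spike counts, one list per signal level.
    Returns None if fewer than two usable levels remain.
    """
    xs, ys = [], []
    for counts in counts_by_level:
        m = statistics.mean(counts)
        v = statistics.variance(counts)      # sample variance (an assumption)
        if m > 0 and v > 0:
            xs.append(math.log10(m))
            ys.append(math.log10(v))
    if len(xs) < 2:
        return None
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

A slope near 1 indicates that variance grows in proportion to the mean count, as expected for Poisson-like firing; a slope below 1 means the variance grows more slowly than the count.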

BMLDs
The relatively low number of homophasic thresholds measured means that intraunit BMLDs could only be calculated in a small number of cases (29 for N0S0 vs N0S, 10 for NS vs NS0). The majority of the single units for which both homophasic and antiphasic thresholds could be determined showed positive BM-LDs, with antiphasic thresholds lower than homophasic thresholds (37 of 39). Many of the intraunit BMLDs observed here, however, were much larger than is conventionally found in psychophysical experiments (12-15 dB): 20 units had BMLDs Ͼ20 dB. These very large BMLDs were due to the units showing very little sensitivity to the homophasic condition, resulting in an unusually high threshold for that condition.
A number of neurons in the population ϳ500 Hz had a very low sensitivity to certain interaural configurations, with the D values only reaching the criterion at very high signal levels. An overall mean of all thresholds for each condition would therefore give a population estimate biased toward these very high thresholds. However, it is plausible that the determination of the psychophysical threshold would be based upon the responses of the most sensitive neurons in the population (Jiang et al., 1997a, b;Parker and Newsome, 1998;Palmer et al., 2000). For this reason, the population threshold was defined as the lower quartile of the thresholds (25th percentile) measured for neurons with best frequencies between 300 and 800 Hz giving thresholds of N0S0 ϭ 57 dB, N0S ϭ 47 dB, NS ϭ 69 dB, and NS0 ϭ 47 dB. The choice of 25% was to a large extent arbitrary; it was chosen to be a more robust population estimate than an extreme value and to correspond to the quartile plotted by default in "box-plots." These thresholds are plotted as horizontal lines in Figure 5. The corresponding thresholds in the IC were N0S0 ϭ 53 dB, N0S ϭ 48 dB (Jiang et al., 1997a), and in a later series of experiments measured at a 10 dB higher noise spectrum level: N0S0 ϭ 57 dB, N0S ϭ 52 dB, and NS0 ϭ 56 dB (Palmer et al., 2000). NS was not measured and N0S not shown in the paper. It is not entirely clear why these thresholds are not proportional to masker level, appearing only 4 dB higher, but it is known that psychophysically BMLDs vary with overall masker level . Applying this correction gives a comparison IC threshold of NS0 ϭ 52 dB. In cortex, the antiphasic threshold is lower and the homophasic threshold higher than in the IC. The overall higher homophasic thresholds may be due to a more restricted firing rate range in cortex than IC. 
The very high homophasic thresholds tended to be due to nonmonotonic MRLFs, which started in one direction as signal level increased but did not reach threshold before they turned and reached threshold in the opposite direction. Antiphasic thresholds tended to be reached in the monotonic portion of the masked rate-level curve. The box-plot summary of the thresholds of neurons with best frequencies between 300 and 800 Hz is shown in Figure 8 (n: N0S0 = 23, N0Sπ = 50, NπSπ = 6, NπS0 = 53), and the binaural masking release between the 25th percentiles is shown as arrows. The binaural masking release effect observed psychophysically is replicated here. In both the N0S0-N0Sπ and NπSπ-NπS0 comparisons, the homophasic threshold is higher than the antiphasic threshold, giving BMLDs of 10 and 22 dB, respectively (Fig. 8, arrows).
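The population estimate described above reduces to two small steps: take the lower quartile of the single-unit thresholds for each condition, then subtract the antiphasic from the homophasic value. The following is a minimal sketch of that arithmetic, not the analysis code used in the study. The quartile values are those reported in the text; the function names, the `example` thresholds, and the choice of the "inclusive" quantile method are assumptions made for illustration.

```python
from statistics import quantiles


def population_threshold(thresholds_db):
    """Lower quartile (25th percentile) of single-unit thresholds, in dB.

    The interpolation method is an assumption; the text does not state
    which quartile definition was used.
    """
    return quantiles(thresholds_db, n=4, method="inclusive")[0]


def bmld(homophasic_db, antiphasic_db):
    """Masking level difference: homophasic minus antiphasic threshold."""
    return homophasic_db - antiphasic_db


# Lower-quartile population thresholds reported in the text (dB).
reported = {"N0S0": 57, "N0Spi": 47, "NpiSpi": 69, "NpiS0": 47}

bmld_n0 = bmld(reported["N0S0"], reported["N0Spi"])      # N0S0 vs N0Spi
bmld_npi = bmld(reported["NpiSpi"], reported["NpiS0"])   # NpiSpi vs NpiS0

# Hypothetical single-unit thresholds, only to exercise the quartile step.
example = [45, 50, 55, 60, 80]
example_q1 = population_threshold(example)

print(bmld_n0, bmld_npi, example_q1)
```

With the reported quartiles this gives the 10 and 22 dB differences quoted in the text. Because the lower quartile ignores the upper tail, a single very insensitive unit (80 dB in the hypothetical list) does not shift the estimate.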

Effect of best ITD on responses to BMLD stimuli
To relate the response to the BMLD stimuli to the neurons' selectivity for ITDs, we computed the MRLFs for all neurons with best frequencies between 300 and 800 Hz summed across neurons with noise best ITDs in 100 μs bins. Examples of these summed masked rate-level functions are shown in Figure 9 for each of the binaural masking configurations and six best ITD bins between ±300 μs. We also show these data as the distribution of activity across best ITD for three different tone levels and for noise alone (Fig. 10). The three signal levels and the noise alone are indicated in Figure 9 (vertical lines; the black noise alone line is near the ordinate). The three signal levels range upward in 10 dB steps from near the estimated antiphasic population threshold.
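The pooling step described above can be sketched as follows. This is an illustrative outline only: the bin alignment (edges at multiples of 100 μs), the function names, and the `units` data are assumptions, and the firing rates are invented placeholders rather than measurements.

```python
import math

BIN_WIDTH_US = 100  # bin width stated in the text


def itd_bin(best_itd_us, width=BIN_WIDTH_US):
    """Lower edge (in microseconds) of the bin containing best_itd_us."""
    return math.floor(best_itd_us / width) * width


def summed_mrlfs(units):
    """Sum masked rate-level functions within each best-ITD bin.

    units: iterable of (best_itd_us, rates), where rates holds one firing
    rate per signal level and all units share the same signal levels.
    Returns {bin lower edge: summed rates per signal level}.
    """
    totals = {}
    for best_itd_us, rates in units:
        key = itd_bin(best_itd_us)
        if key not in totals:
            totals[key] = [0.0] * len(rates)
        totals[key] = [t + r for t, r in zip(totals[key], rates)]
    return totals


# Hypothetical units: (noise best ITD in microseconds, rates at 3 levels).
units = [
    (40, [10, 8, 5]),
    (90, [12, 9, 4]),
    (150, [6, 6, 7]),
    (-30, [3, 4, 9]),
]
summed = summed_mrlfs(units)
print(summed)
```

Reading one signal level across all bins of `summed` gives the kind of activity-versus-best-ITD profile plotted in Figure 10.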
The general shapes of the summed MRLFs in Figure 9 follow those of individual single-neuron MRLFs (compare with Fig. 4) and are predictable from the rate-ITD functions: the closer the best ITDs were to 0, the greater the response to interaurally identical stimuli (N0, S0) and the smaller the response to interaurally inverted stimuli (Nπ, Sπ). As a result, neurons within best ITD bins close to 0 tended to show a decrease in response with increasing signal level in the N0Sπ condition, but an increase in the NπS0 condition. As the signal level was increased >60-70 dB SPL, the total response of these neurons decreased due to the large proportion of nonmonotonic MRLFs measured for N0S0 (~40%; see, e.g., Fig. 1E where the firing rate at 80 dB is on the descending part of the nonmonotonic function). As shown in Figure 10, the effect of increasing the signal level can also be observed for each of the four interaural configurations across the range of best ITDs. For N0S0, increasing the signal level increased the total response slightly for neurons tuned to 100, 200, and 300 μs ITD. In the N0Sπ condition, the overall response of neurons with ITDs within a few hundred microseconds of zero in the contralateral hemifield decreased monotonically with increasing signal level. In the NπSπ condition, an increase in signal level had almost no effect on the overall responses at any best ITD. This presumably is a floor effect: Nπ is at the least optimal ITD for these units; hence, the baseline firing is low and the Sπ signal has little effect in changing this rate. However, a large change was observed in the responses of neurons with best ITDs near 0 with the introduction of the S0 signal to Nπ noise (Fig. 10D). The low baseline rate in response to the Nπ noise is increased markedly by increasing levels of the S0 tone.
[Figure caption fragment: ... Nπ for B and D). The red, green, and blue lines are the summed neuronal responses at increasing signal levels (see key).]
The representations of the neuronal responses shown in Figure 10 are highly weighted toward the most commonly measured best ITDs (50-350 μs); very few neurons in the present dataset had best ITDs outside of this range, which corresponds to the measured range of ecologically available ITDs (Sterbing et al., 2003; Greene et al., 2014).
As explained above, we would expect neural responses to be most different between in-phase and out-of-phase stimuli when the best ITD of the neuron is close to zero simply because the in-phase stimulus is at the peak ITD sensitivity and the out-of-phase at the trough. If the best ITD of the neuron is away from zero, then the responses to in-phase stimuli will be reduced and those to out-of-phase stimuli increased, so there is less modulation in response. We might, therefore, expect that antiphasic masked thresholds would be lower for neurons with best ITDs near 0 compared with those further from zero because there is more scope for the firing rate to be modulated by the signal. To test this idea, we plotted the masked threshold as a function of noise best ITD (Fig. 11) for each binaural condition. Fig. 11 (bottom) shows a box-plot of the thresholds at each best ITD; Fig. 11 (top) shows the number of neurons that contributed. It can be seen that, although there is a great deal of variability, both the median and threshold (lower quartile) for the antiphasic conditions are lowest near 0 best ITD.
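The expectation set out above can be illustrated with an idealized rate-ITD function. The sketch below assumes purely cosine-shaped ITD tuning at the 500 Hz signal frequency, with firing rate normalized to the range 0 to 1; real rate-ITD functions for noise are damped and asymmetric, so this is a simplification, and the function names are hypothetical. An in-phase stimulus is treated as an ITD of 0 and an out-of-phase stimulus as an ITD of half a period.

```python
import math

FREQ_HZ = 500.0  # signal frequency used in the study


def rate(stim_itd_us, best_itd_us, freq_hz=FREQ_HZ):
    """Idealized normalized firing rate (0..1) with cosine ITD tuning."""
    phase = 2 * math.pi * freq_hz * (stim_itd_us - best_itd_us) * 1e-6
    return 0.5 * (1 + math.cos(phase))


def phase_modulation(best_itd_us, freq_hz=FREQ_HZ):
    """Rate difference between in-phase and out-of-phase stimulation."""
    half_period_us = 0.5e6 / freq_hz  # 1000 microseconds at 500 Hz
    in_phase = rate(0.0, best_itd_us, freq_hz)
    out_of_phase = rate(half_period_us, best_itd_us, freq_hz)
    return in_phase - out_of_phase


# Modulation for a few best ITDs (microseconds).
mods = {b: phase_modulation(b) for b in (0, 100, 200, 300, 500)}
for best_itd, mod in mods.items():
    print(best_itd, round(mod, 3))
```

Under these assumptions the difference equals cos(2π f × best ITD): it is largest for a best ITD of 0 and falls to zero at a quarter period (500 μs at 500 Hz), which is the sense in which units tuned near 0 have "more scope" for modulation.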

Discussion
A priori it might be expected that there would be significant modification of the coding of BMLD stimuli in the medial geniculate body, between IC and cortex, and within the cortex because the activity in A1 is known to be modulated depending upon stimulus history and attentional state (e.g., Lee and Middlebrooks, 2013;Yin et al., 2014). However, the similarity of our cortical BMLD measurements to human psychophysical measures suggests that there may be no need for further enhancement by top-down influences.

Effect of anesthesia and recording position
The measurements reported here used the same anesthetics as similar measurements in the IC. However, some anesthetics may have a differential effect at the cortex compared with the IC (Ter-Mikaelian et al., 2007), particularly on the temporal aspects of the responses. Ter-Mikaelian et al. (2007) used pentobarbitone and ketamine, which both have effects on specific ion channels (MacDonald et al., 1989; Franks and Lieb, 1994). We used urethane, which has been shown to induce anesthesia by an effect across a range of ion channels rather than specific families (Hara and Harris, 2002), supplemented with hypnorm, a combination of an opioid and a sedative. This regimen is likely to have different effects from pentobarbitone and ketamine, but we are unaware of data that address differential effects of these drugs at IC and cortex. Our measurements do not depend on precise timing of spikes (all fine timing information about ITDs is converted to mean discharge rates at the level of the brainstem) and so may not be subject to such differential effects of the anesthetic, even if they occur with this regimen.
We did not histologically verify the recording position; however, past experience suggests that these recordings were mostly obtained in layers III and IV of AI. These are the input layers to the cortex. It is therefore possible that a large amount of processing occurs within the cerebral cortex after these recordings.

Population measures
The range of best frequencies observed here is biased by the deliberate selection of the recording sites in the low-frequency part of AI and is comparable with those measured in the IC in similar studies in the same species (Jiang et al., 1997a, b;Palmer et al., 2000). In the IC, the best ITD distribution was highly best frequency-dependent, so the IC data were resampled to have the same best frequency distribution as recorded in cortex. Following this, the best ITD distributions were not significantly different (Fig. 3A). As in the IC, most units were found to be tuned to the contralateral hemifield (Jiang et al., 1997b;McAlpine et al., 2001).

Distribution of P- and N-type thresholds
The distribution of P- and N-type thresholds differed greatly between the stimulus conditions and somewhat from the previous data from the guinea pig IC (Jiang et al., 1997a). In cortex, the greatest differences were between the antiphasic conditions: the N0Sπ condition yielded far more N-type than P-type thresholds (89% N-type), whereas the reverse was true for the NπS0 condition (90% P-type). In the IC, the pattern was similar, but the differences were less extreme (53% N-type for N0Sπ and 60% P-type for NπS0). The present results are more similar to those observed by Asadollahi et al. (2010) in the IC of the barn owl where responses were measured to stimuli presented at the unit's best or worst ITD.
[Figure caption fragment: ... The signal level was chosen to be approximately just above the homophasic threshold in the model (60 dB). The correlogram is plotted as a function of internal delay for auditory filters between 125 and 2000 Hz. The stimulus was first filtered through a gammatone filter bank, transduced by an auditory hair-cell model (Meddis et al., 1990) and then the cross-product between corresponding filters at the two sides calculated. D, E, Center column represents the summary cross-correlation functions obtained by averaging across the 300-800 Hz frequency range. D, Summary cross-correlations for the column on the left (N0). E, Summary cross-correlations for the right column (Nπ). A-H, Black represents noise alone, red represents homophasic conditions (N0S0 and NπSπ), and green represents antiphasic conditions (N0Sπ and NπS0).]
[Figure caption fragment: ... Output of a weighted binaural cross-correlation model for four signal levels of each condition. A-D, Average activity level for a population of neurons with a best frequency between 300 and 800 Hz distributed according to their best ITD (along the ordinate). The weighting function applied here reflects the distribution of best noise ITDs shown in Figure 3. Black line indicates the noise alone situation. The three other signal levels were chosen to match those presented in Figure 10 and represent signal levels around the detection threshold of each condition.]
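The core idea behind the cross-correlation account can be shown with a much reduced sketch. It is not the model described in the captions: it omits the gammatone filter bank, the hair-cell stage, the internal delays, and the best-ITD weighting, and it uses broadband Gaussian noise rather than low-passed noise. It only computes the normalized interaural correlation at zero delay for N0 noise with an S0 or Sπ tone. The sampling rate, duration, tone amplitude, and all function names are assumptions chosen for illustration.

```python
import math
import random


def make_stimulus(signal_amp, signal_inverted, n=4000, fs=20000.0,
                  freq_hz=500.0, seed=0):
    """Return (left, right) waveforms: identical noise (N0) plus a tone
    that is either identical (S0) or sign-inverted (Spi) at the right ear."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    tone = [signal_amp * math.sin(2 * math.pi * freq_hz * i / fs)
            for i in range(n)]
    sign = -1.0 if signal_inverted else 1.0
    left = [nz + t for nz, t in zip(noise, tone)]
    right = [nz + sign * t for nz, t in zip(noise, tone)]
    return left, right


def interaural_correlation(left, right):
    """Normalized cross-correlation of the two ear signals at zero delay."""
    num = sum(a * b for a, b in zip(left, right))
    den = math.sqrt(sum(a * a for a in left) * sum(b * b for b in right))
    return num / den


rho_n0s0 = interaural_correlation(*make_stimulus(1.0, signal_inverted=False))
rho_n0spi = interaural_correlation(*make_stimulus(1.0, signal_inverted=True))
print(round(rho_n0s0, 3), round(rho_n0spi, 3))
```

In this reduced setting the S0 tone leaves the zero-delay correlation at 1, whereas the Sπ tone lowers it. That is consistent with the text's observation that units with best ITDs near 0 reduce their firing as signal level rises in the N0Sπ condition.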