Abstract
Motion direction is a crucial cue for predicting future states in natural scenes. In the auditory system, the mechanisms that confer direction selectivity to neurons are not well understood. Neither is it known whether sound motion is encoded independently of stationary sound location. Here we investigated these questions in neurons of the owl's external nucleus of the inferior colliculus, where auditory space is represented in a map. Using a high-density speaker array, we show that the preferred direction and the degree of direction selectivity can be predicted by response adaptation to sounds moving over asymmetric spatial receptive fields. At the population level, we found that preference for sounds moving toward frontal space increased with eccentricity in spatial tuning. This distribution was consistent with larger receptive-field asymmetry in neurons tuned to more peripheral auditory space. A model of suppression based on spatiotemporal summation predicted the observations. Thus, response adaptation and receptive-field shape can explain direction selectivity to acoustic motion and an orderly distribution of preferred direction.
Introduction
Direction selectivity (DS) in the auditory system has been studied in the contexts of frequency modulation (FM) and motion in space. Encoding FM is important for vocal communication and echolocation (Suga and Schlegel, 1973; Mendelson and Cynader, 1985; Rauschecker, 1997; Razak and Fuzessery, 2006). FM is analogous to movement in visual space, as motion occurs over the tonotopic and retinotopic axes of the auditory and visual receptors, respectively. There is mounting evidence that DS for FM sweeps can be explained by asymmetric excitation and inhibition across the tonotopic axis (Casseday et al., 1994; Zhang et al., 2003; Ye et al., 2010; Kuo and Wu, 2012). These mechanisms are consistent with findings in vision where asymmetric circuit structure confers DS in the retina (Briggman et al., 2011; Wei et al., 2011; Vaney et al., 2012).
DS for sound motion is crucial for tracking auditory objects in space. Neural sensitivity to auditory motion direction has been reported in many species (Sovijärvi and Hyvärinen, 1974; Rauschecker and Harris, 1989; Reale and Brugge, 1990; Wagner and Takahashi, 1990; Ahissar et al., 1992; Stumpf et al., 1992; Spitzer and Semple, 1993; Wilson and O'Neill, 1998; Ingham et al., 2001; Malone et al., 2002). Processing spatial motion may require different mechanisms compared with vision, as auditory space is computed, rather than mapped on the receptor's surface. Adaptation has been proposed to underlie DS (Ingham et al., 2001; Malone et al., 2002; Ingham and McAlpine, 2004; Shestopalova et al., 2012), because it provides a mechanism for activation history to affect subsequent response.
Although adaptation is widely reported in the auditory system (Harris and Dallos, 1979; Ingham and McAlpine, 2004; Ulanovsky et al., 2004; Gutfreund and Knudsen, 2006; Singheiser et al., 2012; Ayala and Malmierca, 2013), few studies have directly linked the time course of adaptation with selectivity for sound motion (Ingham et al., 2001; Malone et al., 2002; Ingham and McAlpine, 2004). Simulations based on the summation of suppression (Reid et al., 1991; Jagadeesh et al., 1993; Tolhurst and Heeger, 1997) could explain motion sensitivity in vision, suggesting that temporal integration may be important for generating DS. Models of inferior colliculus (IC) neurons showed that adaptation could give rise to sensitivity to dynamic binaural cues observed during sound motion (Cai et al., 1998b).
We studied the relationship between the time course of adaptation and DS in a population of space-specific neurons of the owl's midbrain (Knudsen and Konishi, 1978), using a dense hemispheric speaker array. We found that in fact adaptation could predict DS in single cells based on the properties of the neurons' spatial receptive field (SRF). In addition, we found that systematic changes in receptive field shape across the population could account for a topographic representation of DS overlapping the map of auditory space.
Materials and Methods
Surgery
Adult barn owls of both sexes (3 males, 1 female) were implanted with stainless steel head plates and a reference post, as described previously (Steinberg and Peña, 2011; Wang et al., 2012). A dental acrylic well was built around the craniotomy above the external nucleus of the IC (ICx) to allow repeated recording sessions in each animal.
Owls were food-deprived 12 h before recording. At each recording session, anesthesia consisting of intramuscular injections of ketamine hydrochloride (20 mg/kg; Ketaset) and xylazine (4 mg/kg; Anased) was administered with prophylactic antibiotics (ampicillin; 20 mg/kg, i.m.) and lactated Ringer's solution (10 ml, s.c.). The depth of anesthesia was monitored by pedal reflex. Additional injections were given to maintain anesthesia during the experiment. Body temperature was maintained throughout the session with a heating pad.
At the end of each session, the craniotomy was sealed with a clear quick-curing silicone compound (Quick-Pro, Warner Tech-Care). An intramuscular injection of carprofen (3 mg/kg, Rimadyl) was given to relieve inflammation and pain. All owls were able to fly the day following recording. Owls were allowed to recuperate in their home cages for 7–10 d before the next session. These procedures comply with guidelines set forth by the National Institutes of Health and by the Albert Einstein College of Medicine's Institute of Animal Studies.
Extracellular recording
Tucker Davis Technologies (TDT) System 3 and custom programs written in Matlab (MathWorks) were used to present all acoustic stimuli and record neural data. All experiments were performed in a double-walled sound-attenuating chamber (Industrial Acoustics) lined with echo-absorbing acoustical foam (Sonex).
ICx was located stereotaxically and by the characteristic responses to interaural time difference (ITD) and interaural level difference (ILD; Moiseff and Konishi, 1981; Takahashi et al., 1984; Peña and Konishi, 2001). Single and multiunit responses were recorded using 1 MΩ tungsten electrodes (AM Systems) advanced in steps of 10 μm to the level of optic tectum, then at steps of 2–4 μm during search in ICx (Motion Controller, model ESP300, Newport).
Acoustic stimuli
Dichotic stimulation.
We used custom-made earphones each consisting of a speaker (Knowles, model 1914) and a microphone (Knowles, model 1319) housed in a cylindrical metal earpiece that fits in the owl's ear canal. Microphones inside the earphones were calibrated before experiments using a Fostex speaker (FE87E) and a reference Brüel and Kjær microphone (model 4190; Steinberg and Peña, 2011; Wang et al., 2012).
Auditory stimuli delivered through the earphones consisted of five repetitions of 100 ms duration broadband signals (0.5–10 kHz) or tones, with a 5 ms rise-fall time at 10–20 dB above threshold. For each ICx neuron, we first measured the ITD, ILD, frequency tuning, and rate-intensity response using dichotic stimulation. ITD was varied over ±300 μs and ILD over ±40 dB, where negative values represent sounds leading and louder in the left ear, respectively. Frequency ranged from 500 Hz to 10 kHz and sound level varied from 0 dB SPL to 70 dB SPL. Five trials of each test were collected. Stimuli within the tested ranges were randomized during data collection.
Earphones were removed for testing in the free-field, after which earphones were replaced and recalibrated before searching for the next unit with dichotic stimulation.
Stationary receptive field mapping.
Free-field spatial tuning of each neuron was measured using a custom-made hemispherical array of 144 speakers (Sennheiser, 3P127A) constructed inside the sound-attenuating chamber (Pérez and Peña, 2006; Wang et al., 2012). The speaker array spanned ±100° in azimuth and ±80° in elevation. The angular separation between the speakers varied from 10° to 30°. The highest density of speakers was located in frontal space, at the center of the array (±40° around origin) and on the horizontal and vertical axes passing through the origin (±100° azimuth, ±80° elevation). Each speaker in the array was calibrated using a Brüel and Kjær microphone (model 4190). The calibration apparatus was mounted on a custom-built pan-tilt robot positioned at the center of the array. The robot oriented the microphone toward the speaker being calibrated, with fine adjustments assisted by a laser-mounted webcam. Each speaker's transfer function was then measured using a Golay code technique (Zhou et al., 1992), after which an output voltage rms versus stimulus-intensity (dB SPL) curve was computed and stored (Wang et al., 2012). All acoustic stimuli in free field were presented within the dynamic range of the rate-intensity curve for each unit (typically 30–45 dB SPL).
A linear subset of 21 speakers located at −100° to 100° in azimuth at 0° elevation was used to map the SRF in azimuth. Spatial separation between speakers in this subset was 10°. Broadband (0.5–10 kHz) sound bursts 25 ms in duration were presented at random locations within the 21-speaker subset. Up- and down-ramps for each burst were 5 ms and the interstimulus interval was 300 ms. Forty-five to 50 trials were tested for each speaker location. Each unit's preferred direction was designated as the midpoint of the main peak of the spatial tuning curve at half-maximum. After a unit's spatial tuning was determined, the owl was rotated so that the center of the unit's SRF was aligned to 0° azimuth. Stationary SRF mapping was repeated with the owl in the new orientation and subsequent free-field tests were performed in this condition.
Simulated acoustic motion in free field
Moving sound stimuli were presented with the same 21-speaker array used for stationary SRF mapping. Motion was initiated at −100° or 100°. Broadband 25 ms sound bursts were presented in sequence across the array. Onset and offset ramps of adjacent speakers overlapped in time to create a perceptually smooth motion. The duration of each moving stimulus was 425 ms for a 200° displacement, or 470 °/s. Previous studies have used speeds ranging from 125 to 1200 °/s in birds and mammals (Rauschecker and Harris, 1989; Wagner and Takahashi, 1990, 1992; Ingham et al., 2001). Motion from left-to-right (LR) or right-to-left (RL) was randomized over 250 trials with 1 s of silence between trials. Clicks (1 ms duration) were also used for moving stimuli. Motion velocity was controlled by changing the interclick interval (ICI). ICIs of 25 and 250 ms were used to test the effect of motion speed.
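The stated stimulus duration and speed can be checked with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: that adjacent 25 ms bursts overlapped by the 5 ms ramp duration, and that click trains delivered one click per speaker across all 21 speakers (20 intervals).

```latex
\begin{align*}
T_{\text{burst}} &= 21 \times 25\,\text{ms} - 20 \times 5\,\text{ms} = 425\,\text{ms}, &
v_{\text{burst}} &= \frac{200^\circ}{0.425\,\text{s}} \approx 470^\circ/\text{s},\\
T_{25} &= 20 \times 25\,\text{ms} = 0.5\,\text{s}, &
v_{25} &= \frac{200^\circ}{0.5\,\text{s}} = 400^\circ/\text{s},\\
T_{250} &= 20 \times 250\,\text{ms} = 5\,\text{s}, &
v_{250} &= \frac{200^\circ}{5\,\text{s}} = 40^\circ/\text{s}.
\end{align*}
```

Under these assumptions the burst stimulus reproduces the reported 425 ms and 470 °/s, and the two click conditions would correspond to roughly 400 and 40 °/s.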
Adaptation test.
Each trial consisted of a pair of 1 ms clicks presented at a single speaker location corresponding to the center of the receptive field for each neuron. ICIs were randomized between 50 and 500 ms over 800 trials. The interval between pairs of clicks was 800 ms.
Data analysis
Isolation of single units was validated offline by spike-sorting using Wave_Clus (Quiroga et al., 2004). Stationary azimuthal SRFs were computed by averaging the firing rate in response to each speaker for 100 ms after the onset of stimulus. For SRFs in the moving condition, each moving stimulus was treated as one trial. Peristimulus time histograms (PSTHs) were calculated with 5 ms bins. The direction selectivity index (DSI) is described in Equation 1, where FR_LR is the firing rate at the center of the SRF for the LR motion direction and FR_RL is the firing rate for the RL direction. We defined the center as the five speakers that covered the main peak of the SRF (±20°). Positive and negative values of DSI distinguish preference for the LR and RL directions, respectively. Side peak asymmetry (SPA) was calculated in the same way as DSI using the sum within 40° to 80° at either side of the main peak.
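Equation 1 is not reproduced in this text. The following is a minimal Python sketch of the two indices, assuming the common contrast form (LR minus RL, divided by their sum), which matches the stated sign convention. The function names, the dictionary layout keyed by azimuth in degrees, and the left-minus-right sign for SPA (inferred from the Results, where a larger left side peak gives a positive value) are illustrative assumptions, not the authors' code.

```python
def direction_selectivity_index(fr_lr, fr_rl):
    """Contrast index in [-1, 1]; positive means LR preferred (assumed form of Eq. 1)."""
    total = fr_lr + fr_rl
    if total == 0:
        return 0.0  # no response in either direction: treat as non-selective
    return (fr_lr - fr_rl) / total


def center_rate(rates_by_azimuth, half_width_deg=20):
    """Mean rate over the speakers within +/- half_width_deg of 0 deg (five speakers at 10 deg spacing)."""
    center = [r for az, r in rates_by_azimuth.items() if abs(az) <= half_width_deg]
    return sum(center) / len(center)


def side_peak_asymmetry(rates_by_azimuth, inner_deg=40, outer_deg=80):
    """Same contrast applied to summed responses 40-80 deg left vs. right of the main peak."""
    left = sum(r for az, r in rates_by_azimuth.items() if -outer_deg <= az <= -inner_deg)
    right = sum(r for az, r in rates_by_azimuth.items() if inner_deg <= az <= outer_deg)
    return direction_selectivity_index(left, right)


if __name__ == "__main__":
    # Hypothetical firing rates (spikes/s) at the 21 azimuths, SRF centered at 0 deg.
    rates = {az: 1.0 for az in range(-100, 101, 10)}
    rates[0] = 5.0
    rates[-60] = 3.0  # larger left side peak
    print(center_rate(rates), side_peak_asymmetry(rates))
```

Applying `direction_selectivity_index` to the center rates measured for the LR and RL sweeps gives the DSI for one cell.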
Adaptation in pairs of clicks was quantified by comparing the responses to the first (C1) and second (C2) click. Responses to clicks usually lasted between 5 and 30 ms. Spikes were counted for 30 ms after taking into account the response latency for each cell. Firing rates for the first and second clicks were grouped across the ICI range in 35 ms bins. Each bin contained at least 50 trials. Normality of all datasets was assessed using the Lilliefors test.
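The binning step described above can be sketched as follows. The text says only that C1 and C2 responses were compared, so the ratio of mean C2 to mean C1 spike counts used here is an assumption, as are the function name and the trial tuple layout.

```python
from collections import defaultdict


def adaptation_curve(trials, bin_width_ms=35.0, min_trials=50):
    """Group paired-click trials by ICI and return (bin_center_ms, mean_C2 / mean_C1).

    trials: iterable of (ici_ms, c1_spikes, c2_spikes), with spikes counted in a
    30 ms window after each click, already corrected for response latency.
    Bins with fewer than min_trials trials are skipped.
    """
    bins = defaultdict(list)
    for ici, c1, c2 in trials:
        bins[int(ici // bin_width_ms)].append((c1, c2))

    curve = []
    for idx in sorted(bins):
        pairs = bins[idx]
        if len(pairs) < min_trials:
            continue
        mean_c1 = sum(c1 for c1, _ in pairs) / len(pairs)
        mean_c2 = sum(c2 for _, c2 in pairs) / len(pairs)
        if mean_c1 > 0:  # avoid dividing by a silent first-click response
            curve.append(((idx + 0.5) * bin_width_ms, mean_c2 / mean_c1))
    return curve


if __name__ == "__main__":
    # Hypothetical trials: strong suppression at a short ICI, full recovery at a long ICI.
    demo = [(60, 2, 1)] * 50 + [(400, 2, 2)] * 50
    print(adaptation_curve(demo))
```

A ratio near 1 indicates full recovery, and smaller values indicate stronger suppression of the second click.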
Model of spatiotemporal summation
Linear spatiotemporal summation was used to model response adaptation during motion and to predict DS. Excitation elicited by the nth speaker in the moving-sound sequence was scaled by the amplitude of the response in the stationary receptive field (SRF_n). Response adaptation, represented as suppression (S_n), recovered over time (t) for the duration of the sound at each position. Decay of adaptation was modeled as an exponential function (Sen et al., 1996; Varela et al., 1997; Oxenham, 2001) drawn from the data, using a least-squares fit (Matlab) with 1 ms sampling resolution and time constant (τ) for each cell's adaptation curve (Sen et al., 1996; Varela et al., 1997). The starting value of adaptation at the nth speaker depended on suppression carried over from stimuli at previous locations (S_{n−1}; Eq. 2) such that adaptation accumulated over multiple stimuli when they occurred in quick succession. Adaptation over the course of the moving stimulus in the LR or RL directions was estimated from the stationary SRF for each cell by summing suppression elicited by each speaker, starting from the −100° or 100° end of the array.
To predict the SRF during motion [SRFmov(t)], the stationary SRF (SRFstat) was multiplied by the time-dependent suppression function, S(t), in the LR or RL directions (Eq. 3). Because suppression accumulated over time, responses elicited later in time were more strongly suppressed. DSI in the model was calculated in the same way as in the data. The response latency for each neuron was used to temporally align the observed and predicted SRFs.
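Equations 2 and 3 are not reproduced in this text, so the following Python sketch is only one plausible reading of the model. It makes several assumptions: suppression is updated once per speaker rather than at 1 ms resolution; each sound adds suppression equal to `gain` times its normalized stationary response; the scaling factor is one minus the carried-over suppression, clipped at zero; and DSI uses the contrast form. All names and the `gain` parameter are hypothetical.

```python
import math


def predict_moving_srf(srf_stat, tau_ms, step_ms=25.0, gain=0.5, reverse=False):
    """Scale each stationary response by the suppression carried over from earlier speakers.

    srf_stat: stationary responses ordered from -100 deg to +100 deg.
    reverse=False simulates LR motion, reverse=True simulates RL motion.
    Returns predicted responses in spatial order (-100 deg to +100 deg).
    """
    order = list(reversed(srf_stat)) if reverse else list(srf_stat)
    peak = max(order)
    decay = math.exp(-step_ms / tau_ms)  # fraction of suppression left after one step
    carried = 0.0                        # suppression present at sound onset
    predicted = []
    for resp in order:
        scale = max(0.0, 1.0 - carried)  # time-dependent scaling factor
        predicted.append(resp * scale)
        added = gain * resp / peak if peak > 0 else 0.0
        carried = (carried + added) * decay  # accumulate, then decay until next speaker
    if reverse:
        predicted.reverse()
    return predicted


def model_dsi(srf_stat, tau_ms, **kwargs):
    """DSI from summed predicted responses at the five center speakers (assumed contrast form)."""
    lr = predict_moving_srf(srf_stat, tau_ms, reverse=False, **kwargs)
    rl = predict_moving_srf(srf_stat, tau_ms, reverse=True, **kwargs)
    c = len(srf_stat) // 2
    fr_lr, fr_rl = sum(lr[c - 2:c + 3]), sum(rl[c - 2:c + 3])
    total = fr_lr + fr_rl
    return (fr_lr - fr_rl) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical SRF: main peak at 0 deg, larger left side peak at -60 deg.
    srf = [0.0] * 21
    srf[10], srf[9], srf[11] = 1.0, 0.5, 0.5
    srf[4], srf[16] = 0.6, 0.2
    print(model_dsi(srf, tau_ms=98.0))  # negative: RL preferred
```

In this toy example the larger left side peak suppresses the center more during LR motion, so the predicted DSI is negative, matching the direction of the relationship reported in the Results.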
Results
Data from 111 ICx single units were included in this study. ICx cells were narrowly tuned to elevation and azimuth and showed a wide range of sensitivity to motion.
Receptive-field asymmetry predicts direction selectivity
Azimuthal SRFs in the stationary condition were recorded from −100° to 100° with the region of maximal excitation aligned at the center of the array. The center peak was flanked by smaller side peaks of varying size, located 40°–80° to either side. In ICx, spatial tuning in azimuth arises from the selectivity to ITD (Moiseff and Konishi, 1983). Rate-ITD curves in ICx exhibit a larger peak at the characteristic delay and smaller side-peaks as a result of the convergence of ITD channels across frequency (Takahashi and Konishi, 1986; Mori, 1997; Mazer, 1998).
The cells' responses differed between stationary and moving stimulus conditions. Figure 1 shows three examples of responses under stationary and moving conditions. Cells in Figure 1A,B were sensitive to motion direction. Specifically, the cell in Figure 1A preferred the RL direction, when the response at the center was preceded by the smaller side peak. Motion in the LR direction elicited a larger response before arriving at the center, which had a strong attenuating effect. The same was true for the example in Figure 1B, although in the opposite direction; the right side peak was larger and the LR direction was preferred by this cell. In Figure 1C, side peak sizes were similar and the cell was only weakly selective for motion direction at center. In all three examples, the response amplitude at side-peaks in the moving condition was similar to that in the stationary SRF when they preceded center during motion. However, responses at the side peaks were significantly attenuated when they followed response at the center. DS was also tested for moving stimuli in a subset of the speaker array corresponding to only the main peak of the SRF (±20°), thus excluding side peaks. DS was weak when there was no stimulation at the side peaks, demonstrating their strong contribution to DS (Fig. 1, insets). The attenuating effects of the cell's firing on future responses indicated that the shape of the SRFs could significantly influence how neurons respond to moving sounds, in a manner consistent with adaptation.
The relationship between SRF asymmetry and DS was demonstrated for the entire dataset in Figure 2. In the LR motion direction, the left side peak preceded the center, whereas the right side peak preceded the center in the RL direction. The response at the center was proportional to the difference between the side peaks preceding the center in either motion direction, referred to as side peak asymmetry under the moving condition (SPAm). This relationship is plotted in Figure 2A, which shows that when the left side peak was larger (SPAm > 0), RL was the preferred motion direction (DSI < 0). The converse was true for cells with larger right side peaks, which showed a preference for LR motion direction (DSI > 0). Asymmetry between left and right side peaks was also quantified for the stationary SRF, referred to as side peak asymmetry under stationary conditions (SPAs), to disambiguate from the moving condition. DSI was also significantly correlated with SPAs (Fig. 2B), further indicating that the response to side peaks was a robust predictor of DSI. The ratio of side peak response magnitude after and before stimulation at center was 0.23 (median, interquartile range 0.43, Wilcoxon rank sum test p < 10⁻¹⁸) reflecting the attenuating effect of response at center on the subsequent side peak. The ratio between DSIs measured using the subset and full arrays was 0.35 ± 0.05 (n = 24; t test, p < 10⁻⁷), indicating DS was more robust when side peaks were stimulated. Thus, the activation history was closely correlated with DS.
Topography of direction selectivity
Across the population, ICx cells exhibited a wide range of DS and side peak asymmetry (Fig. 2). We thus looked into whether this variability could be explained by where neurons were located in the map of auditory space. In other words, we examined whether DS and SRF asymmetry were correlated with the neurons' spatial tuning, which itself is tied to the neurons' location in the map (Knudsen and Konishi, 1978). Seventy-two of 111 cells showed a larger ipsilateral side peak. There was a weak but significant correlation between side-peak asymmetry and tuning eccentricity such that more peripherally tuned cells showed larger side-peak asymmetry (Fig. 3A). Grouped analysis by spatial tuning at intervals of 10° revealed a systematic reduction in contralateral side-peak height with increasing eccentricity in spatial tuning (Fig. 3B).
There was a weak but significant trend for laterally tuned neurons to show stronger preference for sounds moving toward the front, indicated by the higher number of neurons with negative DSIs (Fig. 3C). Although correlations between SPAs and spatial tuning (Fig. 3A) and between DSI and spatial tuning (Fig. 3C) were low, they were nonetheless both statistically significant and consistent with one another. These observations suggest there is a topographic distribution of DS overlapped with the map of auditory space in ICx, where preference for sounds moving from the periphery to the front increases with tuning eccentricity (Fig. 3D). The most lateral spatial tuning in the recorded sample was 45°, reflecting a known difficulty in accessing the most lateral areas of the ICx map (Knudsen and Konishi, 1978; Wagner et al., 2007). However, because there is an overrepresentation of frontal space, the sampled area covered most of the nucleus (Fig. 3D; Knudsen and Konishi, 1978).
Adaptation as a mechanism for direction selectivity
Because adaptation appeared as a plausible mechanism underlying DS, we used paired-click stimulation to test whether the temporal dynamics of response adaptation could explain DS in 44 ICx neurons. Figure 4A shows an example response to a click (C2) presented at different delays after a preceding click (C1; Fitzpatrick et al., 1995; Brosch and Schreiner, 1997; Wehr and Zador, 2005; Gutfreund and Knudsen, 2006; Singheiser et al., 2012). The response suppression to C2 was strongest when ICIs were <250 ms. The averaged time course for adaptation could be described by an exponential function with a time constant (τ) of 98 ms (Fig. 4B). Double-exponential fits (Yin, 1994; Ulanovsky et al., 2004; Singheiser et al., 2012) to the averaged adaptation data were not distinguishable from single exponential functions. The two types of exponential functions yielded highly correlated predictions (R² = 0.86). For simplicity, we used a single exponential function to describe the temporal properties of adaptation with one variable, τ.
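A single-exponential recovery curve of this kind can be fitted as sketched below. The authors report a least-squares fit in Matlab; this Python sketch substitutes a grid search over τ with a closed-form least-squares amplitude at each candidate, and assumes the recovery takes the form ratio(ICI) = 1 − A·exp(−ICI/τ). The function name, the functional form, and the τ grid are illustrative assumptions.

```python
import math


def fit_recovery(icis_ms, ratios, tau_grid=None):
    """Fit ratio = 1 - A * exp(-ICI / tau) by least squares.

    For each candidate tau the best amplitude A has a closed form, so only tau
    is searched. Returns (tau_ms, amplitude) with the smallest squared error.
    """
    if tau_grid is None:
        tau_grid = [float(t) for t in range(5, 1001)]  # 5 to 1000 ms in 1 ms steps
    best = None
    for tau in tau_grid:
        basis = [math.exp(-ici / tau) for ici in icis_ms]
        denom = sum(b * b for b in basis)
        amp = sum((1.0 - y) * b for y, b in zip(ratios, basis)) / denom
        sse = sum((y - (1.0 - amp * b)) ** 2 for y, b in zip(ratios, basis))
        if best is None or sse < best[0]:
            best = (sse, tau, amp)
    _, tau, amp = best
    return tau, amp


if __name__ == "__main__":
    # Synthetic, noise-free recovery curve generated with tau = 98 ms and A = 0.8.
    icis = list(range(50, 501, 35))
    ratios = [1.0 - 0.8 * math.exp(-ici / 98.0) for ici in icis]
    print(fit_recovery(icis, ratios))
```

With real, noisy data the recovered τ depends on the grid resolution and on the assumed functional form, so the values should be read as estimates.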
If adaptation underlies DS, then the dependence of DS on motion speed should be explained by the time course of adaptation. We therefore measured DS using clicks separated by short intervals (25 ms), when adaptation was strong, and by long intervals (250 ms), when adaptation was weak. Stimulation with clicks elicited weaker direction selectivity than with overlapping sound bursts (Fig. 4C). This can be explained by the effect of gaps between sounds (Wagner and Takahashi, 1992), which may allow for recovery from response suppression. Nevertheless, DS was stronger for stimuli with short ICIs. The effects of interclick interval on DS were consistent with the temporal properties of response adaptation, where suppression was strongest for ICIs <250 ms (Fig. 4A).
Temporal summation of suppression predicts direction selectivity
We constructed a model of response adaptation such that suppression was summed during the motion trajectory. This model is based on short-term plasticity lasting several hundred milliseconds (Varela et al., 1997). We used a sample of 44 cells to test whether the model could predict the directionality from their decay time constant τ. Because the moving sound could be decomposed into stationary sounds at each speaker in the array (Reid et al., 1991; Jagadeesh et al., 1993, 1997), the model treated the stimulus as a sequence of 21 (number of speakers) consecutive stimuli that are contiguous in space. Each sound that excited the cell triggered an amount of suppression proportional to the response at that location. The decay in suppression after the onset of sound was defined by τ. Short intervals between stimuli resulted in accumulated suppression over time because there was not enough time for suppression to decay to zero (Fig. 5A). Figure 5B shows the response to stationary stimuli of the same cell as Figure 1A, where the cumulative suppression for LR (blue arrow) and RL (red arrow) motion directions was modeled as a time-dependent scaling factor, S(t) (see Materials and Methods). Motion starts at either end of the array, where suppression elicited by each sound causes large downward deflections in the scaling factor (Fig. 5B, bottom). The magnitude of the suppression is proportional to the response magnitude in the stationary SRF (SRFstat; Fig. 5B, top). We scaled the SRFstat by the time-dependent cumulative suppression S(t) in each direction to estimate the SRF during motion (SRFmov). This transformation of the SRFstat gave a close prediction of SRFmov for both LR and RL directions (Fig. 5C,D).
To test the effects of suppression decay time on DS, we compared the model's prediction using different time constants. The distribution of τ from adaptation is shown in Figure 6A. Each cell's individual adaptation τ generated predictions for direction selectivity index that matched the data well (Fig. 6B). When a short τ (25 ms) was used to predict DS for all cells, the model underestimated DS (Fig. 6C), because fast-decaying response suppression did not accumulate over time. Large τ values produced good predictions of DSI (Fig. 6D). This lack of overestimation was likely due to the short interstimulus intervals (25 ms) relative to the decay time constants. Differences in suppression decay were more prominent among fast-decaying functions as demonstrated by the underestimation of DS with short τ. Thus, linear spatiotemporal summation of adaptation was sufficient to produce the observed DS in ICx neurons.
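A short worked example may help show why a short τ limits accumulation. As a simplifying assumption, let every stimulus add the same suppression a, and let suppression decay exponentially over the interval Δ = 25 ms between stimuli. The suppression carried into each new stimulus then converges to a geometric-series limit:

```latex
\begin{align*}
r &= e^{-\Delta/\tau}, \qquad
c_{n+1} = (c_n + a)\,r
\;\;\Longrightarrow\;\;
c_\infty = \frac{a\,r}{1-r},\\[4pt]
\tau = 25\,\text{ms}: &\quad r = e^{-1} \approx 0.37, \qquad c_\infty \approx 0.58\,a,\\
\tau = 98\,\text{ms}: &\quad r = e^{-25/98} \approx 0.77, \qquad c_\infty \approx 3.4\,a.
\end{align*}
```

Under this idealization, the suppression carried forward with τ = 98 ms is several times larger than with τ = 25 ms. This agrees qualitatively with the underestimation of DS for the short time constant.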
Discussion
This study builds upon existing evidence to establish a link between response adaptation and the emergence of DS in the auditory system (Cai et al., 1998b; Malone et al., 2002; Ingham and McAlpine, 2004). A model based on spatiotemporal summation of response adaptation could predict each cell's DS using its individual adaptation time constant. At the population level, preference for sounds moving toward the front increased with eccentricity. This was consistent with increased SRF asymmetry for peripherally tuned cells. Our findings suggest that sensitivity to motion may be a general property of spatially tuned cells that adapt.
Comparisons with previous studies
In this study, we show that each cell's DS depended on its SRF asymmetry, which in turn was correlated with the cell's tuning in space. Parameters, such as speaker density and angular distance, determine stimulation in the surround, thus affecting DS at the center. These parameters must be considered when interpreting past work on DS in the owl IC (Wagner and Takahashi, 1990, 1992).
The population trend for DS in ICx is in agreement with fMRI studies showing hemispheric preference for auditory motion in the contra-to-ipsilateral direction in humans (Getzmann, 2011). Our results are, however, in contrast with the preference for outward sounds reported by Wagner and von Campenhausen (2002). We explain this inconsistency by methodological differences between studies. The coarser array (30° between speakers) used in the previous study may not be sufficient to map and stimulate regions of the SRF that surround the excitatory center. Further, DS trends reported in Wagner and von Campenhausen (2002) were derived from a pooled population of several midbrain nuclei and across all elevations whereas we restricted our analysis to only ICx cells responsive at 0° elevation.
In the context of DS, the adaptation time course could determine the sensitive range of motion velocities. Using clicks, we showed that DS decreased for long interstimulus intervals over the same spatial displacement. This is in agreement with Wagner and Takahashi (1992), where DS was strong for motion velocities faster than 300°/s and weaker for slower velocities.
Emergence of SRF asymmetry
The correlation between motion selectivity and SRF shape raises the question of how side peak asymmetry emerges. Side peaks result from ITD computation by coincidence detector neurons in the brainstem (Carr and Konishi, 1990; Yin and Chan, 1990). Early in the ITD pathway, cells narrowly tuned to frequency respond to ITD ambiguously. This ambiguity is resolved in ICx where frequency bands converge. The main peak in the ITD tuning of ICx neurons corresponds to the characteristic delay of the neurons (Rose et al., 1967; Takahashi and Konishi, 1986). Responses at phase-equivalent ITDs are attenuated but not completely eliminated, thus forming side peaks (Takahashi and Konishi, 1986; Mazer, 1998; Peña and Konishi, 2000). It is possible for asymmetric side peaks to emerge through skewed alignment of ITD tuning across frequency (McAlpine et al., 1998). Alternatively, the space-dependent filtering properties of the owl's head and facial ruff (Payne, 1971; Knudsen and Konishi, 1979; Coles and Guppy, 1988) known as the head-related transfer function (HRTF) could contribute to SRF asymmetry. Because HRTFs show increased gain for sounds directly in front of the owl's face (Keller et al., 1998), side peaks closer to frontal space would be amplified by louder sound at the eardrums.
Adaptation mechanism
The adaptation reported here lasts several hundred milliseconds. We interpreted this time course as the lower limit of the suppression decay time needed to produce the observed DS, because very long τ also generated good predictions for DS. The observed adaptation time course is longer than that in the auditory nerve (Harris and Dallos, 1979; Delgutte, 1990) and in the thalamus (Wehr and Zador, 2005) but similar to data obtained using paired stimulation in the IC (McAlpine et al., 2000; Ingham and McAlpine, 2004; Gutfreund and Knudsen, 2006; Singheiser et al., 2012), optic tectum (superior colliculus in mammals; Netser et al., 2011) and auditory cortex (Brosch and Schreiner, 1997; Wehr and Zador, 2005; Nelson et al., 2009; Lanting et al., 2013). Although these studies used similar paired-stimulation protocols and reported comparable recovery time scales, the phenomena studied may be either adaptation or forward suppression.
The commonly made distinction between forward suppression and adaptation is that forward suppression does not rely on the firing rate elicited by the masker (Calford and Semple, 1995; Malone and Semple, 2001; Nelson et al., 2009), although this does not necessarily mean that the mechanisms underlying these phenomena are mutually exclusive (Relkin and Turner, 1988; Oxenham, 2001). However, it is important to discern whether response suppression is dependent on the masker stimulus itself or on the response elicited by the masker (Sanes et al., 1998; Malone and Semple, 2001; Malone et al., 2002; Bartlett and Wang, 2005). Gutfreund and Knudsen (2006) showed that adaptation strength in ICx is proportional to the intensity of the masking stimulus. Our data were consistent with this finding, as the difference in excitation at the side peaks predicted DS. However, we did not find a relationship between response history and adaptation strength on a trial-to-trial basis. This suggests adaptation may not rely on intrinsic spiking mechanisms (Priebe et al., 2002). It is possible that adaptation is relayed to ICx from upstream nuclei where similar properties have been observed (Singheiser et al., 2012). However, novel adaptation properties emerge within ICx, such as adaptation across frequency (Gutfreund and Knudsen, 2006).
Several mechanisms may mediate the long-lasting suppression (hundreds of milliseconds to seconds) observed in adaptation and forward suppression in the auditory system (Ulanovsky et al., 2004; Nelson et al., 2009). Although GABAergic inhibition plays a role in processing binaural cues (Fujita and Konishi, 1991; Sanes et al., 1998; Fukui et al., 2010), direction selectivity (Kautz and Wagner, 1998; Razak and Fuzessery, 2009) and stimulus-specific adaptation (Pérez-Gonzalez et al., 2012), modeling (Cai et al., 1998a,b), and intracellular recordings (Wehr and Zador, 2005) suggest the time course of inhibition is too short to account for the recovery time of adaptation. Further, spike-frequency adaptation (Ingham and McAlpine, 2004) and motion sensitivity in the IC (McAlpine and Palmer, 2002) are not eliminated by blocking GABAergic input. Alternatively, synaptic depression and afterhyperpolarization may contribute to the time course of adaptation.
Validity of the model
Our model assumes response adaptation sums linearly in time, resulting in direction selectivity if asymmetry is present. Linear summation of suppression is supported by psychophysics (Plack et al., 2006) and several studies on DS in vision (Enroth-Cugell et al., 1983; Jagadeesh et al., 1993, 1997). Similar to the relationship between DS and SRF asymmetries shown in the present study, the shape of spatiotemporal receptive fields in simple cells can predict the preferred motion direction (DeAngelis et al., 1993). Linear summation models can underestimate DS (Reid et al., 1991; DeAngelis et al., 1993; Oxenham, 2001) depending on the type of functions used to model the time-dependent suppression decay (Drew and Abbott, 2006).
Effect of anesthesia
In adequate dosages, ketamine anesthesia in birds does not cause profound changes in respiration and cardiovascular function (Degernes et al., 1988). Recovery times from response suppression obtained under ketamine-xylazine anesthesia are comparable with those obtained using other types of anesthesia, such as halothane and nitrous oxide (Gutfreund and Knudsen, 2006), ketamine and medetomidine (Wehr and Zador, 2005), and no anesthesia (Nelson et al., 2009). The administration of barbiturates can prolong recovery time from suppression (Wehr and Zador, 2005), whereas ketamine only mildly affects temporal properties in the IC (Ter-Mikaelian et al., 2007).
Functional implications
Direction selectivity can emerge by lateral connections within sensory maps. Adaptation, on the other hand, could give rise to direction selectivity in single cells. This mechanism could explain direction selectivity in auditory regions where space is not represented topographically (Rauschecker and Harris, 1989; Wilson and O'Neill, 1998; McAlpine et al., 2000; Ingham et al., 2001; Malone et al., 2002). Receptive fields in the visual (DeAngelis et al., 1995; Touryan et al., 2005) and auditory systems (Jenison et al., 2001) can take complex shapes. Thus, the asymmetry required for individual cells to become directional through adaptation may be a common feature in sensory systems.
Footnotes
This work was supported by National Institutes of Health Grants F31 DC012000 to Y.W., and R01 DC007690 to J.L.P., and by a grant from the US-Israel Binational Science Foundation to J.L.P. We thank Michael Beckert, Fanny Cazettes, Gervasio Batista, and Brian Fischer for feedback and comments on the paper.
The authors declare no competing financial interests.
Correspondence should be addressed to Yunyan Wang, Rose F. Kennedy Center 529, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461. yunyan.wang{at}phd.einstein.yu.edu