Abstract
Understanding the relationship between the auditory selectivity of neurons and their contribution to perception is critical to the design of effective auditory brain prosthetics. These prosthetics seek to mimic natural activity patterns to achieve desired perceptual outcomes. We measured the contribution of inferior colliculus (IC) sites to perception using combined recording and electrical stimulation. Monkeys performed a frequency-based discrimination task, reporting whether a probe sound was higher or lower in frequency than a reference sound. Stimulation pulses were paired with the probe sound on 50% of trials (0.5–80 μA, 100–300 Hz, n = 172 IC locations in 3 rhesus monkeys). Electrical stimulation tended to bias the animals' judgments in a fashion that was coarsely but significantly correlated with the best frequency of the stimulation site compared with the reference frequency used in the task. Although there was considerable variability in the effects of stimulation (including impairments in performance and shifts in performance away from the direction predicted based on the site's response properties), the results indicate that stimulation of the IC can evoke percepts correlated with the frequency-tuning properties of the IC. Consistent with the implications of recent human studies, the main avenue for improvement for the auditory midbrain implant suggested by our findings is to increase the number and spatial extent of electrodes, to increase the size of the region that can be electrically activated, and to provide a greater range of evoked percepts.
SIGNIFICANCE STATEMENT Patients with hearing loss stemming from causes that interrupt the auditory pathway after the cochlea need a brain prosthetic to restore hearing. Recently, prosthetic stimulation in the human inferior colliculus (IC) was evaluated in a clinical trial. Thus far, speech understanding was limited for the subjects and this limitation is thought to be partly due to challenges in harnessing the sound frequency representation in the IC. Here, we tested the effects of IC stimulation in monkeys trained to report the sound frequencies they heard. Our results indicate that the IC can be used to introduce a range of frequency percepts and suggest that placement of a greater number of electrode contacts may improve the effectiveness of such implants.
Introduction
In this study, we investigated the percepts evoked by electrical stimulation in the inferior colliculus (IC), a recently introduced candidate site for prosthetics (Lenarz et al., 2006b; Lenarz et al., 2007; Lim et al., 2007; Samii et al., 2007; Lim et al., 2008a; Lim et al., 2009; Calixto et al., 2013; Lim et al., 2013; Lim and Lenarz, 2015). IC implants have been proposed and tested as a means of restoring hearing in patients lacking an auditory nerve (and thus unable to benefit from cochlear implants; Lim et al., 2009). The IC offers several advantages, such as surgical accessibility and a relatively early position in the auditory pathway compared with the auditory thalamic or cortical areas (Dobelle et al., 1973; Rousche et al., 2003; Otto et al., 2005a, 2005b; Lenarz et al., 2006b; Samii et al., 2007; Atencio et al., 2014) and it may therefore permit insertion of signals at a stage involving less complex or abstract processing of auditory information. Patients in a small (n = 5) trial experienced a variety of auditory percepts and improved speech perception when assisted with lip-reading cues, but the results for open-set speech perception were poor (Lim et al., 2008a; Lim et al., 2009; Lim and Lenarz, 2015).
These results provide proof of concept, but suggest that there may be previously unappreciated challenges associated with matching the stimulation paradigm to the IC's functional architecture (Lim et al., 2009; Bulkin and Groh, 2011). In particular, only ∼17% of the IC in monkeys is tonotopically organized; the bulk of the remainder responds preferentially to low-frequency sounds (<1 kHz; Ryan and Miller, 1978; Bulkin and Groh, 2011). A similar low proportion of tonotopic penetrations (18%) was reported in a human study using a comparable analysis of fMRI data (Ress and Chandrasekaran, 2013), which suggests that, in the human and rhesus monkey, the tonotopically organized portion of the IC may be more spatially restricted than in other species (Rose et al., 1963; Geniec and Morest, 1971; Clopton and Winfield, 1973; Merzenich and Reid, 1974; Aitkin and Moore, 1975; FitzPatrick, 1975; Knudsen and Konishi, 1978; Moore et al., 1983; Stiebler and Ehret, 1985; Poon et al., 1990; Kelly et al., 1991; Casseday and Covey, 1992; Malmierca et al., 1995; Miller et al., 2005; Bulkin and Groh, 2011; De Martino et al., 2013; Ress and Chandrasekaran, 2013). In addition, neurons in the monkey IC are sufficiently broadly tuned that >30% of IC sites show responses to sounds in the range of 400–3000 Hz at a modest sound level of 50 dB SPL (Bulkin and Groh, 2011). Such sparse tonotopy and coarse tuning could provide an explanation for the challenges in achieving consistent success with human auditory midbrain implants.
Accordingly, we sought to test the effects of stimulation in the monkey IC, focusing on how stimulation-induced alterations in auditory perception correlated with the best frequency (BF) of the neurons at the electrode site. We found a statistically significant but coarse correlation between the perceptual effects of stimulation and the BF of the neurons at the stimulation site. Consistent with proposals for a second clinical trial (Lim and Lenarz, 2015), the findings suggest that a large number of electrode sites may be beneficial, either because this would permit selecting a subset of the most effective sites for use or because costimulation could be applied to multiple sites to better mimic the natural auditory response properties observed in the IC. Finally, the development of an animal model closely related to humans in which both physiological and perceptual measurements can be made should increase the pace of progress in this important field. This work is the next logical step to build on a foundation of related studies in cats and rodents demonstrating the effects of stimulation in auditory brain areas using behavioral tasks (Gerken, 1970; Gerken et al., 1985; Gerken et al., 1991; Rousche et al., 2003; Otto et al., 2005a, b; Znamenskiy and Zador, 2013; Tsunada et al., 2016) or physiological measures at stages either after the IC (i.e., auditory cortex; Lenarz et al., 2006a; Neuheiser et al., 2010; Calixto et al., 2012; Atencio et al., 2014) or before it (i.e., the cochlea; Mulders and Robertson, 2000, 2006; Zhang and Dolan, 2006).
Portions of this work were previously presented in abstract form (Ross and Groh, 2009, 2010; Pages et al., 2012).
Materials and Methods
Subjects.
Subjects were three female rhesus macaques (monkeys M, V, and J; ages 3–16 years, weighing 2.8–9.3 kg). Animal M had prior experience with related physiological experiments (Bulkin and Groh, 2011). Subjects were seated in primate chairs (Crist Instruments) and comfortably head restrained using a standard head post system (Crist Instruments). Animal procedures were conducted in accordance with the principles of the Guide for Care and Use of Laboratory Animals of the National Institutes of Health (eighth edition, revised 2011) and involved standard operative and postoperative care, including the use of anesthetics and analgesics for all surgical procedures. All animal procedures were approved by the Duke University Institutional Animal Care and Use Committee.
Targeting of the IC.
Recording chambers were oriented toward the IC based on stereotaxic coordinates (Paxinos) using stereotaxic surgical instruments (Kopf). After the animals recovered from surgery, structural MRI images of the animals' brains were collected to verify the position of the recording cylinder with respect to the IC, which could easily be visualized and distinguished from the surrounding tissue on the MRI (see Fig. 3A–C; Groh et al., 2001; Bulkin and Groh, 2011).
Recordings were conducted using moveable electrodes (impedance 0.09–3.29 MΩ; FHC), with a new electrode used every few sessions. Electrodes were positioned using a computer-controlled microdrive (NAN Instruments) that allowed recording from a different location for each session and across a wide range of sites in the IC. Electrodes approached the IC at an angle (∼20 degrees from vertical in the coronal plane) designed to traverse the tonotopic gradient in the IC in which lower BFs are shallow and higher best-frequency sites are deeper. Such tonotopy has been reported in a subset of more central IC locations in the monkey (Ryan and Miller, 1978; Zwiers et al., 2004) and occurred in ∼17% of penetrations that entered the IC in a previous study from our group (Bulkin and Groh, 2011). We identified the tonotopic region in each of the monkeys used in this study. Because only the tonotopic region contains heterogeneity of BFs, we focused our data collection on this area, but it is likely that some sites were drawn from the surrounding low-frequency-tuned region. Examples of tonotopic penetrations are shown in Figure 3, D and E.
For some sites, we also measured eye movements via scleral search coil (Judge et al., 1980) to verify that we were not recording from the nearby superior colliculus (SC), which exhibits auditory (Jay and Sparks, 1984, 1987; Lee and Groh, 2012; Lee and Groh, 2014) as well as robust saccade-related activity (Wurtz and Goldberg, 1972; Sparks, 1978). Eye movement and position-related signals are also observed in the IC but are less prominent there than in the SC (Groh et al., 2001; Porter et al., 2006; Bulkin and Groh, 2012). Eye movements were typically not monitored during the behavioral task.
Finally, response latency measures were consistent with the bulk of the recordings deriving from the IC's central nucleus. We assessed response latency from peristimulus time histograms for each site and defined response onset as the time at which the activity level exceeded 3 SDs above the mean bin height during the baseline period for 2 consecutive 3 ms bins, similar to a method we have used previously (Maier and Groh, 2010). The end of the first bin was defined as the response onset time. The mean latency of the auditory responses of the stimulation sites was 15.7 ms, consistent with the 15.3 ms latency reported by Ryan and Miller (1978) for the central nucleus and shorter than the 25.1 ms reported by those investigators for the surrounding shell region.
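The onset criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function name `response_onset_ms`, the use of the sample SD for the baseline, and the assumption that the post-stimulus bins start exactly at sound onset are all assumptions made for the example.

```python
from statistics import mean, stdev


def response_onset_ms(psth, baseline_bins, bin_ms=3.0, n_sd=3.0, n_consecutive=2):
    """Return response onset (ms after sound onset), or None if no onset is found.

    psth: list of bin heights; the first `baseline_bins` entries are the
    baseline period and the remaining entries start at sound onset (assumed).
    """
    baseline = psth[:baseline_bins]
    # Threshold: baseline mean plus n_sd standard deviations (sample SD assumed).
    threshold = mean(baseline) + n_sd * stdev(baseline)
    post = psth[baseline_bins:]
    for i in range(len(post) - n_consecutive + 1):
        if all(height > threshold for height in post[i:i + n_consecutive]):
            # Onset is defined as the END of the first supra-threshold bin.
            return (i + 1) * bin_ms
    return None


# Hypothetical PSTH: 10 baseline bins followed by 7 post-onset bins.
example_psth = [2, 3, 2, 3, 2, 3, 2, 3, 2, 3] + [3, 2, 9, 3, 8, 10, 12]
onset = response_onset_ms(example_psth, baseline_bins=10)
```

In the hypothetical data, the isolated high bin (9) is ignored because the following bin falls below threshold; only the run beginning at the fifth post-onset bin satisfies the two-consecutive-bin rule.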
Behavioral task and stimuli.
During recording and stimulation phases of the experiment, animals performed a behavioral task requiring comparison of the frequencies of either tones (N = 385/424 included experiments, defined further below under “Response properties and inclusion criteria”) or band-pass noise of different center frequencies (N = 39/424; bandwidth ± 0.26 octaves). Stimuli were delivered through a midrange speaker (Cambridge Soundworks) 1.1 m in front of the animal's head. Each sound was individually calibrated (to 45–65 dB SPL, A-weighted) with a tolerance of ±1 dB SPL. Behavioral responses were recorded via a custom-built touch sensor; correct trials were reinforced via fluid rewards delivered through a tube positioned in front of the animal's mouth. For most experiments (n = 386/424), all stimuli were 55 dB SPL. For other experiments (n = 38/424), we randomly interleaved 45, 55, and 65 dB SPL stimuli.
The design of the frequency discrimination task is shown in Figure 1. Two aspects of the task were essential for our purposes. First, the monkeys needed to discriminate the direction of frequency differences rather than simply detecting a change in frequency. We assume that pairing stimulation in the IC with the presentation of a sound changes the perceptual attributes of that sound in multiple ways (e.g., frequency, timbre, loudness, location). A directional frequency-discrimination task permits isolating the frequency effects; the other attributes likely add variability to performance (e.g., see Fig. 2D), but should not create a correlation between stimulation-induced changes in perception and the BF of the stimulation site.
Frequency discrimination task. Animals initiated trials by grasping a touch sensor. They then heard a series of “reference” sounds (shown as black dots) followed by “probe” sounds (red dots) interleaved with reference sounds. They reported whether the probe was higher (A) or lower (B) in frequency than the reference sounds. Decisions were indicated by either releasing (“go now”) or continuing to hold the touch sensor during the presentation of the probe sounds; in the latter case, they had to release the lever at the conclusion of the probe sounds when a broad band noise was delivered (gray bar; “go later”). Correct trials were reinforced with a liquid (typically juice) reward. During experiments that included electrical stimulation, the stimulation was applied on 50% of the trials and was delivered during the probe stimuli.
Another essential attribute is that the reference frequency (RF) had to be changeable. By roving the RF, we could ensure that the monkey was making a true comparison between the probe and RFs rather than memorizing a response matched to each probe frequency. This permitted us to assess how the effects of stimulation depended on the BF of various stimulation sites in relation to different points of reference in frequency space.
The task was a go now/go later design, ensuring that a response was required on every trial. Animals initiated each trial by grasping the touch sensor. After five repeats of a 150 ms “reference” frequency sound, up to three repeats of a “probe” frequency were presented, interleaved with two additional repeats of the RF (150 ms interstimulus interval). The repeating nature of the probe and RFs was modeled after a frequency discrimination task used successfully in rats by Han et al. (2007). Our design diminished the need to rely on memory for task performance (Bigelow et al., 2014) and provided flexible control over the durations of the different phases of the task.
Two monkeys (V and J) were trained to “go now,” meaning release the touch sensor, if the probe sounds were higher frequency than the RFs and to hold the touch sensor until after the discrimination epoch was complete if the reverse was true (“go later”). One monkey (M) was trained with the opposite contingencies. Training animals in different sets of contingencies served to ensure that observed effects of stimulation were not response-motion specific. If the animal correctly completed the discrimination phase of a “go later” trial, it received a white noise cue (uncorrelated noise, 45–65 dB SPL) indicating that it could release the touch sensor to harvest the reward it had earned by performing correctly during the discrimination phase. Therefore, all trials culminated in the release of the touch sensor, preventing monkeys from adopting a strategy of grasping the sensor throughout the experimental session and thereby collecting rewards on 50% of the trials.
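The response contingencies described above can be summarized as a small lookup. The following Python sketch is illustrative only; the function name `correct_response` and the string labels are hypothetical, and the case of a probe equal to the RF is not handled because the text does not describe it.

```python
def correct_response(monkey, probe_hz, ref_hz):
    """Return the rewarded response ("go now" or "go later") for one trial.

    Monkeys V and J: "go now" when the probe is higher than the reference.
    Monkey M: opposite contingency ("go now" when the probe is lower).
    """
    if monkey not in ("M", "V", "J"):
        raise ValueError("unknown monkey: " + repr(monkey))
    probe_is_higher = probe_hz > ref_hz
    go_now_means_higher = monkey in ("V", "J")
    return "go now" if probe_is_higher == go_now_means_higher else "go later"


# Hypothetical trial: probe at 900 Hz against an 827 Hz reference.
response_v = correct_response("V", 900.0, 827.0)
response_m = correct_response("M", 900.0, 827.0)
```

Because the two contingencies are mirror images, an effect of stimulation that appears as more "higher" reports in all animals cannot be explained by a simple bias toward one motor response.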
This task was used for both characterizing response properties of the multiunit activity at each stimulation site and conducting the stimulation experiment. For a subset of sites, we also collected data in response to other sounds while the monkeys were passive/resting (there was no task) to verify that the general physiological characteristics of the sites (e.g., tonotopy; see Fig. 3D,E) were broadly consistent with previously published results in the primate IC.
Training on the task was accomplished using operant conditioning and took 6 months or more to achieve reliable performance for near-threshold discriminations at multiple RFs.
During the stimulation phase of the experiment, stimulation was applied on 50% of the trials (interleaved pseudorandomly). Stimulation was delivered at the same time as the probe sound pips (150 ms train duration). Therefore, the stimulation trials included both acoustic stimulation and electrical stimulation. The animals were rewarded for responding on the basis of the actual sounds rather than stimulation; any effects of stimulation on perception tended to reduce the number of rewards that the animals received.
Recording.
Neural signals were amplified, filtered, thresholded, and sorted using a standard system (Plexon) and the spike times were relayed to a NIDAQ card (National Instruments) on a desktop computer. The desktop computer ran software (Beethoven; Ryklin Software) to control the stimuli, monitor the progress of the trials, and save the data to a hard disk.
Response properties and inclusion criteria.
We determined the BF of each site by fitting a Gaussian function (MATLAB software; The MathWorks) to the spike counts observed in a 140 ms window (10–150 ms after sound onset) after either reference or probe sounds (tones or band-pass noise of different center frequencies) delivered during the performance of the frequency discrimination task. BF was defined as the tested frequency nearest the peak of that Gaussian function. Plots of neural responses versus sound frequency can be found in Figures 4, A and C, and 5, A and C. Sites were excluded from further analysis if the Gaussian fit was not significant (p > 0.05), accounted for <20% of the variance in firing rate (r² < 0.20), or if the tuning was so broad that BF was not an appropriate metric of the response pattern (largest 25% of Gaussian σ values). These frequency-tuning curve inclusion criteria resulted in the inclusion of data from 172 (n = 16 in animal M, n = 148 in animal V, n = 8 in animal J) of 333 recording/stimulation sites in the dataset.
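The BF definition and the three exclusion rules can be expressed compactly. The sketch below assumes the Gaussian fit has already been performed elsewhere and supplies its peak, p-value, r², and σ; the function names are hypothetical. Measuring "nearest" on a log2 (octave) axis and passing the σ cutoff in as a precomputed value are both assumptions, because the text does not specify either detail.

```python
import math


def best_frequency(peak_hz, tested_hz):
    """Tested frequency nearest the fitted Gaussian peak.

    Distance is measured in octaves (log2 ratio); this is an assumption.
    """
    return min(tested_hz, key=lambda f: abs(math.log2(f / peak_hz)))


def passes_inclusion(p_value, r_squared, sigma, sigma_cutoff):
    """Apply the three exclusion rules described in the text.

    sigma_cutoff: boundary of the broadest 25% of sigma values, computed
    beforehand over whichever population of sites is appropriate.
    """
    if p_value > 0.05:       # Gaussian fit not significant
        return False
    if r_squared < 0.20:     # fit explains <20% of the variance
        return False
    if sigma > sigma_cutoff: # tuning too broad for BF to be meaningful
        return False
    return True


# Hypothetical site: fitted peak at 600 Hz, tested at three frequencies.
bf = best_frequency(600.0, [400.0, 566.0, 800.0])
keep = passes_inclusion(p_value=0.01, r_squared=0.45, sigma=0.4, sigma_cutoff=1.0)
```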
The testing involving a given recording/stimulation site is referred to as a “session.” For some sessions, the effects of stimulation compared with multiple RFs were tested. We refer to the data collected in regard to each of these RFs as an “experiment.” A total of N = 424 stimulation experiments form the dataset (n = 32 in animal M, n = 384 in animal V, and n = 8 in animal J). For some sessions, only one reference was used (n = 60 sessions). When there was more than one RF per session (n = 51 sessions and 364 references), they could be either in separate blocks (n = 111 experiments) or randomly interleaved (n = 253 experiments). For some RFs, only two probe frequencies were used (n = 74 experiments), typically in sessions with multiple interleaved RFs.
We strove to test a variety of relationships between recording site characteristics and task parameters. Sometimes, the RF was chosen to be above the BF of the multiunit activity (n = 289 experiments) and, in other instances, it was below the BF (n = 131 experiments). Very rarely (n = 4 experiments), the RF was selected to be equal to the BF. RFs ranged from a minimum of 400 Hz to a maximum of 4000 Hz. Some RFs (e.g., 827 Hz) were used more often than others to facilitate comparisons involving the same reference conditions (see Fig. 7).
Typically, the probe frequencies were selected to be near the RF (see Figs. 4, 5). In some experiments, probe frequencies were chosen by the experimenter, with a total range across all experiments of 3.23 octaves below the reference to 2.75 octaves above the reference. In other experiments, they were positioned symmetrically above and below the RF with an increased representation close to the reference and a total range of one octave. Across all experiments, the maximum probe frequency was 5680 Hz and the minimum probe frequency was 263 Hz.
Stimulation parameters.
Stimulation consisted of trains of biphasic, charge-balanced, cathodic-leading pulses delivered through the same electrodes that were used for recording. A Grass S88 dual output square pulse stimulator was used to control the stimulation pulses in a constant-current mode. The leading cathodic phase of the pulse pairs was 0.2 ms and up to 80 μA (or up to 16 nC per phase). We explored stimulation parameters over the course of the study. During early experiments, we tended to choose lower currents (20 μA or less) and, during later experiments, we tended to choose stronger currents (60–80 μA). The effects as a function of current intensity are shown in Figure 8.
The trailing anodic phase always matched the cathodic phase in charge. It was equal in duration and amplitude to the cathodic phase in most experiments (n = 390). In other experiments (n = 34), it lasted 10 times longer but had one-tenth the amplitude of the cathodic phase (i.e., 2 ms duration, up to 8 μA; McIntyre and Grill, 2002); this isolated the effect of the cathodic phase while still keeping the phases charge balanced (McIntyre and Grill, 2002).
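The charge figures above follow from simple arithmetic: charge per phase is amplitude multiplied by duration, and μA × ms gives nC directly. The short Python check below is only an illustration of that arithmetic; the function name is hypothetical.

```python
import math


def charge_per_phase_nc(amplitude_ua, duration_ms):
    """Charge delivered in one phase, in nanocoulombs (uA x ms = nC)."""
    return amplitude_ua * duration_ms


# Maximum cathodic phase: 80 uA for 0.2 ms.
cathodic_nc = charge_per_phase_nc(80.0, 0.2)

# Symmetric anodic phase: same amplitude and duration as the cathodic phase.
symmetric_anodic_nc = charge_per_phase_nc(80.0, 0.2)

# Asymmetric anodic phase: 10x the duration at one-tenth the amplitude.
asymmetric_anodic_nc = charge_per_phase_nc(80.0 / 10.0, 0.2 * 10.0)

# Both anodic variants balance the cathodic charge.
balanced = (math.isclose(cathodic_nc, symmetric_anodic_nc)
            and math.isclose(cathodic_nc, asymmetric_anodic_nc))
```

Both anodic variants deliver the same 16 nC as the maximum cathodic phase, which is what keeps the pulse pair charge balanced.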
For most experiments, the pulses were delivered at 200 Hz (n = 418). However, for a few experiments, different frequencies (100–300 Hz) of electrical stimulation were used (n = 6). In all experiments, the onset and offset of the stimulation trains coincided with the onset and offset of the sounds (duration = 150 ms).
Analysis of stimulation effects.
We quantified the magnitude of perceptual bias introduced by stimulation (“stim”) as the change in the proportion of correct choices in trials where the correct answer was “high” or “low.” This was then compared with control (“nostim”) trials. A difference between performance in stimulation and control trials indicates that the stimulation has caused the animal to report a different perception. The equation for this analysis was as follows:
A positive value of this stimulation effect index (SEI) reflects a greater proportion of “high” choices on stimulated than on nonstimulated trials; a negative value indicates the opposite. These changes are analogous to the additions of “signal” described previously in the visual system (Salzman et al., 1992). Each experiment was classified as significant or not significant based on a Fisher's exact test. The four fields in the exact test were high choice stim, low choice stim, high choice no stim, and low choice no stim. We used a threshold of p < 0.05 for these single-site tests. When using the SEI for population analyses, we used a more conservative significance threshold (p < 0.01) because we performed various exploratory population analyses before the main results reported here.
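The single-site significance test can be illustrated with a self-contained two-sided Fisher's exact test on the four fields listed above. This Python sketch is an assumption-laden illustration rather than the study's code: the two-sided p-value sums the probabilities of all tables no more likely than the observed one (a common convention), and `high_choice_shift` is a simplified stand-in that only captures the sign convention described for the SEI, not the SEI formula itself.

```python
from math import comb


def fisher_exact_two_sided(high_stim, low_stim, high_nostim, low_nostim):
    """Two-sided Fisher's exact p-value for a 2x2 table of choice counts."""
    row1 = high_stim + low_stim          # stimulated trials
    row2 = high_nostim + low_nostim      # nonstimulated trials
    col1 = high_stim + high_nostim       # all "high" choices
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "high" choices on stimulated trials.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(high_stim)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum all tables at least as extreme (small tolerance for float ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def high_choice_shift(high_stim, low_stim, high_nostim, low_nostim):
    """Proportion of "high" choices on stim minus nostim trials (stand-in only)."""
    return (high_stim / (high_stim + low_stim)
            - high_nostim / (high_nostim + low_nostim))


# Hypothetical counts for one experiment.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
shift = high_choice_shift(3, 1, 1, 3)
```

A positive `shift` corresponds to more "high" choices on stimulated trials, matching the sign convention stated for the index.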
To quantify nonspecific interferences in performance, analogous to additions of “noise” described previously (Salzman et al., 1992), we fit sigmoidal psychometric functions to the behavioral data separately for stimulated and nonstimulated trials. From those functions, we computed the discrimination threshold, or frequency difference associated with a shift from 25% to 75% high choices. We then computed the difference between thresholds on stimulated and nonstimulated trials, as a percentage of the RF for that session as follows:
A positive value of this threshold shift index indicates that the monkey performed worse on the stimulated trials overall than on the nonstimulated trials. This index could only be computed for sites tested with more than two probe frequencies, so only those sites were included in this analysis.
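As a worked illustration of the threshold computation, suppose the sigmoid is a logistic function of probe frequency in Hz with midpoint x0 and slope k (the specific functional form and parameter names are assumptions; the text says only that the fits were sigmoidal). The 25% and 75% points then lie at x0 ∓ ln(3)/k, so the threshold is 2·ln(3)/k, and the index is the stim minus nostim difference expressed as a percentage of the RF.

```python
import math


def logistic(x, x0, k):
    """Assumed psychometric form: proportion of "high" choices at frequency x."""
    return 1.0 / (1.0 + math.exp(-k * (x - x0)))


def discrimination_threshold(k):
    """Frequency difference (Hz) spanning 25% to 75% "high" choices."""
    return 2.0 * math.log(3.0) / k


def threshold_shift_index(k_stim, k_nostim, rf_hz):
    """Threshold difference (stim - nostim) as a percentage of the RF."""
    diff = discrimination_threshold(k_stim) - discrimination_threshold(k_nostim)
    return 100.0 * diff / rf_hz


# Hypothetical slopes: stimulation halves the slope (flatter curve).
index = threshold_shift_index(k_stim=0.01, k_nostim=0.02, rf_hz=827.0)
```

A flatter psychometric function on stimulated trials (smaller k) gives a larger threshold and therefore a positive index, consistent with the interpretation that positive values reflect worse performance.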
We also characterized sites using logistic regression analysis (Salzman et al., 1992). However, some of the most powerful stimulation effects that we observed were not well modeled by logistic functions. Therefore, aside from using them for illustrative purposes in Figures 2, 4, and 5, we did not use results of the logistic fits herein.
Results
Behavior
We first verified that the monkeys were making judgments about probe frequencies in relation to RFs. Figure 2, A–C, shows examples of performance of each animal on the task during training (Fig. 2C) or testing (Fig. 2A,B). The positive slope of each curve indicates that perceptual reports were correlated with probe frequency, demonstrating that the animals were performing the task as intended. The shifts in the curves for each RF indicate that the monkeys were successfully making a comparison to the RF rather than memorizing a particular stimulus–response association. The two monkeys trained to ignore probe sound level were largely successful in doing so (Fig. 2D).
Example psychometric curves. Individual dots indicate the percentage “higher” judgments for a given probe frequency; each color (A–C) or color family (D) represents a different RF. Positive slopes of logistic regressions indicate a greater proportion of “higher” choices for higher frequencies for a given RF; the shifts in the curves for the different RFs indicate that the monkeys were judging probe frequency relative to the corresponding RF. Trials came from testing sessions in which electrical stimulation was delivered in other trials, which are excluded from analysis in this figure (A, B, D), or from training trials in which electrical stimulation was not delivered at all during the session (C). The sound level of the RF was 55 dB SPL. Probe sound levels could be 45, 55, or 65 dB SPL for monkeys V and J; these monkeys successfully ignored this jitter in probe sound level and made judgements based primarily on probe frequency (D).
IC location and frequency tuning
Stimulation at sites with a range of BFs is thought to be critical for the success of IC implants, particularly given the broad range of spectral content in speech sounds. We targeted primarily the tonotopic region of the IC because previous mapping studies (Ryan and Miller, 1978; Bulkin and Groh, 2011) showed that sites with high BFs are rarely found outside of that area. Low-frequency-responsive sites are easier to find because they lie both within the tonotopic region and the surrounding regions (Ryan and Miller, 1978; Bulkin and Groh, 2011).
Figure 3, A–C, shows MR images indicating the location of the sampled region of the IC with respect to the recording chamber. Marking wires placed in the grid for the purpose of the scan facilitated reconstruction of recording/stimulation locations, which are delineated with red lines superimposed on the MR images. Sample penetrations with tonotopic progressions of frequency tuning are shown in Figure 3, D–G. These multiunit responses varied as a function of tone frequency and depth in the IC, progressing from preferring lower to higher frequencies as the electrode advanced (Fig. 3D–G, top to bottom).
Location of recording sites within the IC (A–C) and tonotopic organization at example electrode penetrations (D–G). A–C, MRI scans in the coronal plane. The recording cylinders were filled with saline for the scan and can be seen above the brain in white. The dark bands within the recording cylinders are the plastic grids used to hold the shaft of the electrodes as they advanced toward the IC. Penetration locations could be reconstructed based on the positions of marking wires placed in the grid for the MRI scans. The red lines indicate the medial and lateral extent of our stimulation sites based on these reconstructions. The location of the IC is indicated in B by the dashed circles on the opposite side of the brain for clarity. The rostral/caudal extent of the stimulation sites is indicated by the bar above the relevant sections. The dorsal/ventral dimension in monkeys V and J is explored further in D–G. Each plot indicates the multiunit response properties of an IC site as the animal passively listened to tones. The plots in a given column indicate the responses as the electrode was advanced at semiregular intervals from dorsal to ventral locations. Sites responded to higher and higher frequency sounds with increasing depth, demonstrating tonotopy consistent with the central nucleus of the IC and less consistent with other nearby regions. Tonotopy in monkey M was characterized in a previous study from our group (Bulkin and Groh, 2011).
Stimulation: example sites
Electrical stimulation of the IC influenced the monkeys' performance on the frequency discrimination task and could produce different effects at sites preferring different frequencies. Figure 4 shows two example effects, one at a low-frequency-preferring site (Fig. 4A,B) and the other at a high-frequency-preferring site (Fig. 4C,D). For both of these sites, stimulation shifted the psychometric function away from veridical judgments based on the sound alone toward the BF of the stimulation site. For the low-frequency-preferring site (Fig. 4B), the filled circles/thick line corresponding to stimulation trials lie below those of the corresponding nonstimulated data, reflecting fewer “higher” choices and more “lower” choices. For the high-frequency-preferring site, the stimulated curve is shifted upward (or leftwards; Fig. 4D), reflecting an excess of “higher” choices.
Example effects of stimulation on frequency discrimination. The site illustrated in A and B was sharply tuned to the frequency as assessed with tones, exhibiting a BF of 600 Hz (peak of Gaussian fit; A). When stimulation was applied during performance of a task with a 3000 Hz RF (B), the monkey increased its proportion of “lower” choices and reduced its proportion of “higher” choices. C, D, A different site, tested with band-pass noise, having a BF of 3014 Hz, which was higher than the RF of 827 Hz. Stimulation at this site shifted the psychometric function in favor of higher choices. The effects of stimulation were statistically significant for both sites (Fisher's exact test; B, n = 235 trials, p = 6.1 × 10⁻⁹; D, n = 1318 trials, p = 1.2 × 10⁻¹⁴).
At other sites, stimulation affected performance in ways that were not correlated with the BF. Figure 5 illustrates two different examples. For the site illustrated in Figure 5, A and B, stimulation impaired performance across the entire tested range of probe frequencies while only modestly shifting it. This can be seen by the flattened slope of the psychometric function on stimulated trials compared with nonstimulated trials (threshold shift of 3.2%). Figure 5, C and D, shows a “wrong-way” effect—a case in which stimulation shifted the psychometric function in a direction inconsistent with the response properties of the site. That is, the site preferred sounds lower in frequency than the RF used for this experiment (Fig. 5C), but stimulation increased the proportion of “higher” choices (Fig. 5D).
Electrical stimulation could also impair performance on the frequency discrimination task (A, B) or shift behavior away from the BF of the neurons at the stimulation site (C, D). The site illustrated in A had a BF of 454 Hz, which was lower than the RF of 827 Hz, but stimulation at this site produced only a small shift in the direction of the BF (B; Fisher's exact test, n = 1169, p = 0.022). Rather, the psychometric function was slightly but consistently flattened on stimulation trials compared with nonstimulated trials. C, D, Example “paradoxical” effect in which the stimulation exerted a shift in the psychometric function in the direction opposite to what would be expected based on the frequency-tuning properties of the site. The site had a BF of 509 Hz, lower than the RF of 827 Hz, but stimulation increased the proportion of high-frequency judgments relative to the control trials (Fisher's exact test, n = 197, p = 0.0039).
Stimulation: population results as a function of frequency tuning and stimulus parameters
The effects of stimulation as a function of the response properties of the stimulation site across the full population of sites are illustrated in Figures 6 and 7. There was a relationship between the BF and bias in the animal's report of frequency compared with the RF when stimulated at these sites (Fig. 6A, r = 0.16, p = 0.0013, n = 424 experiments, linear regression of SEI vs the difference between best and RFs, expressed as octaves). However, the relationship between the effects of stimulation on performance and the response properties of the site was coarse. Fifty-two percent of the sites produced effects in the expected direction (points in the upper right and lower left quadrants of Fig. 6A), whereas 44% were “wrong-way” effects (points in the upper left and lower right quadrants of Fig. 6A). The remaining 4% lay on either the SEI = 0 or the BF-RF = 0 lines. Accordingly, we evaluated several other aspects of the data to determine whether a better correlation would be seen when taking into account these other factors.
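The population analysis described above can be sketched as follows: express BF relative to RF in octaves, correlate that difference with the SEI, and classify each experiment by quadrant. The Python below is a minimal illustration with hypothetical function names and made-up data points; it computes only Pearson's r and does not reproduce the significance test or any value reported in the text.

```python
import math
from collections import Counter


def octave_difference(bf_hz, rf_hz):
    """BF relative to RF in octaves: log2(BF) - log2(RF)."""
    return math.log2(bf_hz) - math.log2(rf_hz)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def quadrant(bf_minus_rf_oct, sei):
    """Classify one experiment by the signs of (BF - RF) and SEI."""
    if bf_minus_rf_oct == 0 or sei == 0:
        return "on-axis"
    same_sign = (bf_minus_rf_oct > 0) == (sei > 0)
    return "expected" if same_sign else "wrong-way"


# Hypothetical (BF Hz, RF Hz, SEI) triples, for illustration only.
experiments = [(3014.0, 827.0, 0.30), (600.0, 3000.0, -0.25),
               (509.0, 827.0, 0.10), (827.0, 827.0, 0.05)]
diffs = [octave_difference(bf, rf) for bf, rf, _ in experiments]
seis = [sei for _, _, sei in experiments]
r = pearson_r(diffs, seis)
counts = Counter(quadrant(d, s) for d, s in zip(diffs, seis))
```

Working in octaves keeps the comparison symmetric around the RF, so a site one octave above the reference and a site one octave below it are treated as equally distant.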
A, Effect of electrical stimulation on frequency discrimination correlated with the difference between BF of the recording site and the RF. Filled circles represent individual experiments; some individual experiments fall outside of the plotted range on this y-axis scale and are plotted as gray circles at the relevant edges of the panels (stacked so that they can all be seen). Significance at the population level was assessed via linear regression of SEI versus log2(BF) − log2(RF) across the experiments. A positive slope for the population data (r = 0.16; p = 0.0013; n = 424) indicated that the animals reported higher frequency percepts when stimulated at sites where the BF was higher than the RF and the animals reported lower frequency percepts when stimulated at sites where the BF was lower than the RF. B–D, Data subdivided by RF, revealing that the pattern was qualitatively similar for different ranges of RFs.
The effect of electrical stimulation on frequency discrimination correlated with BF of the recording site (A; r = 0.11, p = 0.009), but not with RF (B; r = 0.05, p = 0.27). Again, points outside of the axes are shown in gray circles and are both stacked and slightly jittered horizontally for clarity.
First, we tested whether the relationship between SEI and BF-RF held true for different ranges of RFs (Fig. 6B–D). These data were generally consistent with the pattern of results shown in the dataset as a whole, although the p-values were higher due to fewer data points.
We next tested whether BF alone provided a better account of the stimulation effects than BF-RF. Although there was a statistically significant relationship between SEI and BF (Fig. 7A, r = 0.11, p = 0.009), there was a trend toward this relationship being weaker than that observed when the difference between BF and RF was taken into account (r = 0.11 for BF alone vs r = 0.16 for BF-RF). Finally, the effects of stimulation were uncorrelated with RF alone (Fig. 7B, linear regression, r = 0.05, p = 0.27). We did not identify any obvious additional factor in the stimulus conditions that better accounted for the variability in the effects of stimulation beyond the difference between BF and RF.
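The quantities used in this population analysis can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the function names are hypothetical, the SEI values in the usage note are made up, and the SEI sign convention (positive means more “high” choices on stimulation trials) is taken from the figure legends. The sketch computes the BF-RF difference in octaves, assigns an experiment to an “expected” or “wrong-way” quadrant, and computes a Pearson correlation coefficient of the kind reported for the regression.

```python
import math

def octave_difference(bf_hz, rf_hz):
    """Difference between best and reference frequency in octaves: log2(BF) - log2(RF)."""
    return math.log2(bf_hz) - math.log2(rf_hz)

def classify_effect(bf_hz, rf_hz, sei):
    """Label an experiment by the quadrant it occupies in an SEI vs BF-RF plot.

    Assumes sei > 0 means stimulation added "high" choices.
    """
    diff = octave_difference(bf_hz, rf_hz)
    if diff == 0 or sei == 0:
        return "on-axis"  # lies on the SEI = 0 or the BF-RF = 0 line
    # Same sign: shift toward the BF (expected); opposite sign: wrong-way.
    return "expected" if (diff > 0) == (sei > 0) else "wrong-way"

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

For example, with the BF of 509 Hz and RF of 827 Hz from the “paradoxical” example site and a hypothetical positive SEI, `classify_effect(509, 827, 5.0)` returns `"wrong-way"`, because the BF lies below the RF while stimulation added “high” choices.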
Population results as a function of stimulation strength
We next considered the effects of stimulation current strength. It has been theorized that optimal stimulation current must be strong enough to activate a sizeable population of neurons, but not so strong as to activate a population of neurons with heterogeneous tuning properties (Murasugi et al., 1993; Groh et al., 1997). Perceptually, increasing current amplitude might make the stimulation sound “louder” and/or broaden the frequency content associated with the stimulation-evoked percept.
Our results are broadly consistent with the efficacy of stimulation increasing with current strength. More intense electrical stimulation increased the proportion of sites for which SEI differed from zero (Fig. 8A, Fisher's exact test, p < 0.05). However, we tended to test higher currents in later sessions in which the monkeys were better practiced at the task. As a result, the mean number of trials per experiment tended to be higher for the higher currents, making Fisher's exact test more likely to reach statistical significance. To control for this factor, we also quantified the absolute value of the SEI as a function of current (Fig. 8B). SEIs varied significantly with current strength (ANOVA, n = 424, F = 4.11, p = 0.00022); those tested with 40 μA and above tended to produce slightly larger effects than those tested with 20 μA and below. There was no clear pattern for discrimination threshold effects over this range of currents (Fig. 8C). The largest impairment in frequency discrimination occurred at 40 μA; at 80 μA, there was no significant impairment in frequency discrimination.
Relationship between electrical stimulation current and its effects on frequency discrimination. A, More intense electrical stimulation increased the proportion of sites where its effect was statistically significant (Fisher's exact test, p < 0.05). B, Stimulation tended to produce the largest systematic perceptual shifts at higher stimulation currents (40 μA and above). This panel plots the absolute value of the SEI versus stimulation current. SEI indicates the percentage “high” choices “added” by stimulation, so its absolute value captures both shifts in favor of more high choices and shifts in favor of more low choices. C, Stimulation-induced changes in frequency discrimination threshold as a function of stimulation current (Eq. 2).
Discussion
Previous studies have suggested that stimulation of the IC may be effective for restoration of hearing in patients unable to use cochlear implants (Lenarz et al., 2006b; Colletti et al., 2007; Lim et al., 2008a; Lim et al., 2008b; Lim et al., 2009; Lim and Lenarz, 2015). However, open-set speech reception with IC implants is inferior to levels observed with cochlear implants (Lim et al., 2008a; Lim and Lenarz, 2015). Although this is not surprising for early-stage development of a novel prosthetic (Wilson and Dorman, 2008), it indicates the need for testing in a behaving/perceiving animal model to accelerate the development process (Gerken, 1970; Gerken et al., 1985; Gerken et al., 1991; Rousche et al., 2003; Otto et al., 2005a, b; Znamenskiy and Zador, 2013; Guo et al., 2015; Tsunada et al., 2016). Our study shows that IC stimulation can bias auditory perceptual judgments in monkeys and opens up this animal model as a test bed for improving implant design for humans. Furthermore, the use of monkeys with normal hearing permits comparison of the perceptual effects of stimulation to the normal response properties of the stimulation sites.
A previous physiological mapping study (Bulkin and Groh, 2011) provided insight into a critical problem observed in early IC implant recipients: limited pitch ranking of electrode sites. This problem was attributed to failure to place the electrode array along the tonotopic axis of the IC (Lim et al., 2008a). The tonotopic region may be a small target: only ∼17% of penetration locations passing through the monkey IC encountered a tonotopic gradient (Bulkin and Groh, 2011). The remainder of the frequency-sensitive region of the monkey IC was tuned for a consistent range of low frequencies and did not show much heterogeneity or organization (Fig. 9A, green squares indicate tonotopic penetrations, red squares indicate low-frequency-tuned penetrations; reprinted from Figs. 5 and 6 in Bulkin and Groh, 2011).
The primate IC shows limited topography and broad tuning at individual sites. This figure is reprinted from Figures 5 and 6 from Bulkin and Groh (2011). In that study, we mapped the IC in an awake but nonperforming monkey, characterizing the frequency-tuning properties of multiunit activity at sites spaced 500 μm apart along electrode trajectories passing through the IC and spaced 1 mm apart in a plane slightly tilted from the anterior/posterior and medial/lateral axes (see Bulkin and Groh, 2011 for full details). A, Only about one-fourth of the penetrations passing through tone-responsive regions were classified as tonotopic (green squares vs red ones), or 17% of the auditory responsive region as a whole. Non-tonotopic tone-responsive sites (red squares) responded best to low frequencies. Sites that were responsive but not selective for frequency were also observed (blue squares). Our present stimulation experiments were conducted at frequency selective sites (either low-frequency tuned or tonotopic locations). B, Proportion of sites responding as a function of tone frequency at 50 dB. Because all sites were tested at an overlapping set of frequencies and intensities, estimates of the proportion of the IC responding to a given tone frequency at a fixed, moderate intensity could be obtained. Dots on the right side of this plot show the corresponding information for broadband or white noise stimuli (marked WN).
The most comparable evidence for human tonotopy is strikingly consistent with that of the monkey: an fMRI study that quantified tonotopy using the same statistical method as Bulkin and Groh (2011) identified ∼18% of the IC as tonotopic (Ress and Chandrasekaran, 2013); other studies reported evidence of tonotopy, although they did not quantify its scope (Geniec and Morest, 1971; De Martino et al., 2013). If the human IC is only sparsely tonotopic and variable across subjects (Ress and Chandrasekaran, 2013), it will be challenging to place a single array of electrodes with any consistency in a tonotopic region. Indeed, plans for the upcoming second clinical trial include the implantation of multiple arrays of electrodes (Lim and Lenarz, 2015) and this may increase the likelihood that the electrodes span sites with a range of different BFs.
A second issue that emerged from the physiological mapping study was that tuning of IC neurons to sound frequency was quite broad. For sounds of moderate intensity (50 dB), a tone of 500 Hz elicited responses in 40–80% of the sites tested. Between 400 and 3000 Hz, at least ∼30% of the IC was responsive (Fig. 9B). For broadband sounds, this proportion was higher: at least 60% of IC sites were responsive (dots on right side of plot in Fig. 9B).
Therefore, a challenge for stimulation in the IC compared with the cochlear implant is that stimulation at a single electrode site is likely to activate a very small proportion of the activity that would occur naturally in a hearing subject in response to speech or other environmental sounds. If the IC is a sphere 4 mm in diameter, it has a volume of ∼33 mm³. If stimulation at 40 μA activates a volume of between 0.004 and 1 mm³ (Gross and Rolston, 2008; Kuncel et al., 2008), that corresponds to 0.012–3% of the IC or 0.04–10% of the minimum population that would be activated by a pure tone of moderate intensity, that is, somewhere between 1:2500 and 1:10. In contrast, a single pulse of cochlear implant stimulation above perceptual threshold was estimated to generate 1200 spikes in auditory nerve fibers (Wilson et al., 2014). If every one of these spikes occurs in a different nerve fiber, then that corresponds to 1200 fibers of ∼30,000 total. Although the number of auditory nerve fibers activated by a given sound is not known, it was estimated that ∼5000 fibers may be active for a click of moderate intensity (Kiang et al., 1976). This suggests that cochlear implant signals more closely approach the size of active populations that occurs for actual sound in subjects with normal hearing: 1200/5000 = 24% or 1:4. The implication is that more electrodes placed throughout the 3D structure of the IC will be needed to produce electrically evoked neural signals that are commensurate in scope with those that occur naturally.
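The arithmetic behind this estimate can be checked with a short Python sketch. It assumes volumes in cubic millimeters, an activated volume of 0.004–1 mm³ (the range that reproduces the 0.012–3% figure), and a minimum tone-responsive fraction of 30% of the IC, taken from the mapping data described above. The constant and function names are illustrative only.

```python
import math

IC_DIAMETER_MM = 4.0          # IC idealized as a sphere
ACTIVATED_MM3 = (0.004, 1.0)  # assumed lower and upper bounds of tissue activated at 40 uA
MIN_TONE_FRACTION = 0.30      # minimum fraction of the IC responsive to a moderate tone

def sphere_volume(diameter):
    """Volume of a sphere from its diameter."""
    radius = diameter / 2
    return 4 / 3 * math.pi * radius ** 3

ic_volume = sphere_volume(IC_DIAMETER_MM)  # about 33.5 mm^3

def fractions(activated_mm3):
    """Return (fraction of the whole IC, fraction of the tone-responsive population)."""
    frac_ic = activated_mm3 / ic_volume
    frac_tone = frac_ic / MIN_TONE_FRACTION
    return frac_ic, frac_tone

for volume in ACTIVATED_MM3:
    frac_ic, frac_tone = fractions(volume)
    print(f"{volume} mm^3: {frac_ic:.3%} of IC, "
          f"{frac_tone:.2%} of tone-responsive population (about 1:{1 / frac_tone:.0f})")

# Cochlear implant comparison: spikes per pulse relative to fibers active for a click.
cochlear_ratio = 1200 / 5000
print(f"cochlear implant: {cochlear_ratio:.0%} (about 1:{1 / cochlear_ratio:.0f})")
```

Under these assumptions, the lower bound comes out near 1:2500 and the upper bound near 1:10 of the tone-responsive population, compared with roughly 1:4 for the cochlear implant.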
In light of these numbers, it is remarkable that we were able to observe stimulation effects at all, much less a predictable if coarse relationship between the frequency-tuning properties of the neurons at the stimulation site and the evoked percept. In auditory cortex, previous studies with electrical stimulation showed that rats can distinguish between stimulation at different electrode sites, that stimulation can substitute for auditory stimuli in task performance, and that performance is correlated with the frequency-tuning properties of electrode sites (Otto et al., 2005a, 2005b). However, a related study using optogenetic stimulation did not always yield a correlation between choice behavior and the frequency-tuning properties of the stimulation site, depending on the particular neural population being targeted (Znamenskiy and Zador, 2013). A recent monkey study found a correlation between frequency tuning and the perceptual effects of microstimulation in one region of auditory cortex, the anterior lateral belt region, but not in the nearby middle lateral belt region (Tsunada et al., 2016). These somewhat different results suggest that the relationship between neural selectivity patterns and perception may not be straightforward and neurons with similar response patterns but different efferent connectivity patterns may play different roles.
Why does the IC have such a large active population in response to any given tone and what does this imply about the functional organization of this structure? It may be that the population is subdivided into groups that, although sensitive to the same incoming stimulus, each contribute to different aspects of auditory perception. For example, a subset of 1000 Hz-responsive neurons might contribute not to the perception of a tone of that frequency, but rather to the pitch of a harmonic complex for which 1000 Hz is a harmonic (e.g., 1000 Hz is the fifth harmonic of 200 Hz). A region sensitive to the pitch of harmonic complexes has been identified in the auditory cortex of primates (Bendor and Wang, 2005, 2006). Therefore, stimulating that subset of the IC's 1000 Hz-responsive neurons might ultimately activate the 200 Hz-sensitive pitch complex neurons in auditory cortex, evoking a pitch percept of 200 Hz rather than 1000 Hz. Other subsets of neurons may specialize in encoding non-frequency-related attributes of a 1000 Hz tone, such as location or loudness, much as a neuron in visual cortex responds only to stimuli in its receptive field, but may also show selectivity for attributes such as color, contrast, or shape. Because we only assessed performance in frequency discrimination, any stimulation-evoked contributions to non-frequency-related attributes would appear as variability in our results. Again, the use of prosthetics containing a greater number of electrodes spread out over a broader volume of the IC should increase the probability that a sufficient number that contribute perceptual attributes important for speech reception can be identified.
Another factor that may have disrupted the relationship between BF and stimulation effects concerns the relatively low frequency of stimulation in the present study, typically 200 Hz. IC neurons may have phase locked to the stimulation pulses, producing competition between temporal and place codes. In human IC implant recipients, the relationship between stimulation and perceptual frequency saturated at 250 Hz (Lim et al., 2008b); that is, above the frequency that we used. Therefore, our stimulation may have introduced temporal cues to pitch that would not likely correlate with a site's BF and might appear as variability in our results.
In addition, electrical stimulation can affect fibers of passage (Grill et al., 2005). The IC contains an extensive pattern of connectivity within and between laminae (Malmierca et al., 1995). Our recordings were likely dominated by signals emanating from the cell bodies and overlooked the frequency-tuning properties of these fibers, a potentially considerable source of variability. Human IC implants likely also activate fibers of passage, contributing to challenges in pitch-ranking electrode sites. Optogenetic approaches might one day allow greater specificity and resolve the issues related to fibers of passage, but this technology is currently unavailable for this type of therapeutic use in humans and many safety issues have yet to be resolved (Gilbert et al., 2014).
Our work demonstrates the viability of a behaving primate model for comparison of electrically evoked percepts to real sounds. Furthermore, the chronic-acute preparation enabled testing of a large number of sites. These two factors make the model attractive for testing future innovations before use in humans. Although the present study used stimuli with limited spectrotemporal properties, future work could be expanded to include stimuli along any dimension that the animals can perceive.
The behavioral paradigm used here is appropriate for testing in any auditory brain region. Of particular relevance is the cochlear nucleus, another brain area that is already in widespread use as a human prosthetic site (the auditory brainstem implant), but which has been considered only a partial success (Schwartz et al., 2008; Colletti et al., 2012; Vincent, 2012). To our knowledge, the tonotopic organization of the cochlear nucleus has yet to be quantitatively mapped with physiological or imaging techniques in either humans or monkeys, thus making the problem of ensuring that electrodes are placed across a suitable range of BFs even more challenging. The behaving monkey model developed here may lead to improvements in such implants.
Footnotes
This work was supported in part by the National Institutes of Health (NIH Grant R01EY016478-01 to J.M.G.), the Duke Institute for Brain Sciences (Incubator Award to J.M.G., W.M.G., and Nell Cant), and Duke WISeNet (Graduate Fellowship to D.S.P. sponsored by National Science Foundation Grant NSF DGE-1068871). We thank Tom Heil, Jessi Cruger, Karen Waterstradt, Christie Holmes, and the Duke Division of Laboratory Animal Resources for expert technical assistance; Bao Tran-Phu, Aida Ibrahim, Sydney J. Koke, Jessi Cruger, Laura Paulsen, and Diana Friedeberg for assistance with animal training and data collection; and Nell Cant, Deborah Tucci, Kurtis Gruters, and Valeria C. Caruso for insightful commentary and discussion.
The contents of this manuscript are solely the responsibility of the authors and do not necessarily represent the official views of NIH or NSF. B.S.W. serves as a consultant for MED-EL, GmbH, in Innsbruck, Austria, a manufacturer of auditory prosthetics. The remaining authors declare no competing financial interests.
Correspondence should be addressed to either Daniel S. Pages or Jennifer M. Groh, Center for Cognitive Neuroscience, Duke University, LSRC Rm B203, Durham, NC 27708. dspages@gmail.com or jmgroh@duke.edu