A fundamental task of the auditory system is to encode time-varying sounds in the environment around us. For instance, when clicks are repeated in a train, we can discern each click as long as the presentation rate is below a certain threshold. Above that threshold, we perceive a continuous sound like a buzz, hum, or pitch. Along the auditory pathway, neurons progressively lose their temporal fidelity: the auditory nerve can phase lock or synchronize to stimulus presentation rates >1 kHz, but the synchronization rate of neurons in auditory cortex falls to ∼30–50 Hz. Auditory cortex therefore uses a combination of stimulus-synchronized and nonsynchronized (NS) population responses to encode temporal information: synchronized responses follow slow temporal features (a temporal code), whereas nonsynchronized responses increase their spike rate (a rate code) for temporal features that vary too fast to be represented by synchronized responses (Lu et al., 2001; Joris et al., 2004). Why we perceive slow, repeating acoustic events (<40 Hz), such as an idling engine, as distinct events and fast ones, such as a revving motorbike, as a continuous sound is thought to be a direct result of how the signals are represented by synchronized and nonsynchronized cortical neurons, respectively (Bendor and Wang, 2007).
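To make the distinction between synchronized and nonsynchronized firing concrete, the following minimal sketch computes vector strength, a commonly used index of phase locking. This measure is an assumption introduced here for illustration and is not taken from the articles discussed; the function name and the example spike times are hypothetical.

```python
import cmath
import math


def vector_strength(spike_times_s, rate_hz):
    """Return the vector strength of spikes relative to a periodic stimulus.

    Each spike is mapped to a phase within the stimulus cycle; the length of
    the mean resultant vector is near 1 when spikes cluster at one phase
    (synchronized) and near 0 when they are spread evenly over the cycle.
    """
    if not spike_times_s:
        return 0.0
    # Phase of each spike in radians; exp(1j * phase) wraps it onto the cycle.
    phases = [2.0 * math.pi * rate_hz * t for t in spike_times_s]
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return abs(resultant) / len(spike_times_s)


# Hypothetical spike trains for a 10 Hz stimulus (period 0.1 s).
locked = [0.0, 0.1, 0.2, 0.3]        # one spike at the same phase of every cycle
spread = [0.0, 0.025, 0.05, 0.075]   # spikes evenly spread across the cycle

vs_locked = vector_strength(locked, 10.0)
vs_spread = vector_strength(spread, 10.0)
```

In this toy case the locked train yields a value close to 1 and the evenly spread train a value close to 0. A nonsynchronized neuron would show low vector strength at fast rates while its mean spike rate still changes with the stimulus rate.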
Cochlear implants (CIs) can provide temporal information about sound through modulated electric pulse trains delivered to the auditory nerve. However, only synchronized cortical responses to slowly repeating CI stimulation have been reported so far (Schreiner and Urbas, 1988; Middlebrooks, 2008). In a recent study in The Journal of Neuroscience, Johnson et al. (2017) endeavored to determine how rapid modulations in CI stimulation are represented in primary auditory cortex (A1) and how this coding scheme relates to acoustic sound. The authors implanted a CI in one ear of marmosets and recorded the spike activity of single neurons in both hemispheres in response to either CI or acoustic stimulation delivered in an alternating manner to the CI-implanted ear (right ear) or the intact ear (left ear). In this way, they could compare the response of each neuron to time-varying acoustic and CI stimuli in A1 of the same animal. The acoustic stimuli were short Gaussian click trains at various sound levels and presentation rates, and the CI stimuli were trains of electric pulses at varying current levels and repetition rates. Extensive effort was made to identify stimulus parameters (e.g., frequency/electrode position, sound/current level, stimulus rate) that could drive single-unit firing. The reported averaged response is therefore the sum of individual neurons responding to different preferred stimuli.
Johnson et al. (2017) report two major results. First, they found A1 populations that showed nonsynchronized firing in response to rapid CI stimulus trains. This has not been reported in previous CI physiology studies. The likely explanation for this finding is that they used awake animals rather than anesthetized animals as in earlier studies (Schreiner and Urbas, 1988; Middlebrooks, 2008). The finding is not surprising per se, but the absence of nonsynchronized populations in previous studies has puzzled the field because CI users can perceive pitch with an increasing repetition rate (described below). The second finding was that A1 neurons responded to time-varying acoustic and CI stimulation using the same coding scheme (Johnson et al., 2017). Across a wide range of temporal modulations, both stimulus modalities were represented by a combination of synchronized and nonsynchronized populations. Interestingly, the distribution and the response boundaries (the presentation rate at which synchronized responses transition to nonsynchronized responses) for acoustic and CI stimuli were similar for each population type. Furthermore, there were no differences in best frequency or laminar distribution. Based on this, the authors propose that A1 neurons process temporal auditory information independently of the nature of the stimulus (acoustic or electric).
There are at least two issues with the authors' second interpretation. First, it is difficult to draw conclusions about the population level, because the reported averaged activity is the sum of individual neurons responding to different preferred stimuli. This is analogous to the sound of individual instruments playing their favorite piece of music rather than the output of a symphony orchestra playing the same notes. The problem with such “custom-made” stimulation schemes was recently highlighted in an optogenetic study, which showed that the conclusions we draw about the role of a single neuron can be sensitive to how it was manipulated by various experimental parameters (Phillips and Hasenstaub, 2016). To support the findings by Johnson et al. (2017), it would be useful to record responses to each stimulus from many neurons at the same time and determine whether the general population and the single-unit responses are complementary. The second issue is that Johnson et al. (2017) report that time-varying acoustic and CI input engage A1 neurons in a similar way in terms of, for example, population distribution, firing rate, and cortical depth. In contrast, using the same four CI-implanted animals, the authors previously found that the two modalities yielded distinct responses: CI stimulation did not activate A1 neurons as efficiently as acoustic stimuli; CI-responsive neurons had different frequency response areas than CI nonresponsive neurons; and CI nonresponsive neurons were actively suppressed, rather than simply not being activated by the CI (Johnson et al., 2016). Surprisingly, the authors do not link the two articles. One explanation could be that the article by Johnson et al. (2017) studies neurons that respond to the CI stimulation and not those that do not. Alternatively, CI stimulation might engage A1 neurons differently in spectral (Johnson et al., 2016) versus temporal (Johnson et al., 2017) stimulation paradigms.
What is the neural mechanism underlying the generation of the synchronized and nonsynchronized responses and the transition from one to the other (at the temporal-to-rate response boundary)? And why would it be similar for acoustic and electric hearing? It is well established that a finely tuned balance between inhibition and excitation is needed to shape and refine cortical dynamics over time (Wehr and Zador, 2005). In computational models of responses to acoustic sound, Bendor (2015) and Gao et al. (2016) showed that strong excitation followed by delayed inhibition produced synchronized responses, whereas weak net excitation due to concurrent excitation and inhibition could generate nonsynchronized responses. As acoustic and CI stimulation analogously engaged A1 neurons in the study by Johnson et al. (2017), it is likely that the brain uses a comparable excitatory–inhibitory interplay to interpret time-varying electric stimulation. Nonetheless, the role of inhibition in this process has so far not been demonstrated directly in vivo. Parvalbumin-positive interneurons are known for promoting temporal precision (Wehr and Zador, 2005). They might therefore be key controllers of, for instance, the temporal-to-rate code transition.
Across auditory, visual, and somatosensory systems, the neural coding and perceptual boundary between repetition rates that generate a sensation of discrete events and those that generate a sensation of continuous events is ∼40–60 Hz. At least in the auditory system, this is thought to be a result of the temporal integration window of A1 neurons of 25 ms (Bendor and Wang, 2007). With slower repetition rates, only one event occurs within this window and can thus be represented by a phase-locked spike to each event. Faster repetition rates produce multiple events in the integration window and are consequently represented by the firing rate. Matching strategies for neural–perceptual coding across sensory systems are likely to be important for cross-modal integration, discrimination, and plasticity. For instance, subjects trained to discriminate tactile intervals also got better at discriminating sound intervals (Nagarajan et al., 1998). In light of this consistency across sensory systems, it makes sense that Johnson et al. (2017) would find similar coding schemes for acoustic and electric hearing.
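The arithmetic behind the integration-window argument can be sketched as follows. This is a deliberately simplified illustration: it assumes a hard 25 ms window and a sharp boundary, whereas the boundary reported across sensory systems is a range (∼40–60 Hz). The function names are hypothetical.

```python
def boundary_rate_hz(window_ms=25.0):
    """Repetition rate at which exactly one event fits in the window."""
    return 1000.0 / window_ms


def events_per_window(rate_hz, window_ms=25.0):
    """Average number of stimulus events falling inside one window."""
    return rate_hz * window_ms / 1000.0


def predicted_code(rate_hz, window_ms=25.0):
    """Simplified prediction: at most one event per window can be followed
    spike by spike (synchronized); more than one is summarized by the
    firing rate (nonsynchronized)."""
    if events_per_window(rate_hz, window_ms) <= 1.0:
        return "synchronized"
    return "nonsynchronized"


boundary = boundary_rate_hz()       # 1000 / 25 = 40 Hz
slow = events_per_window(20.0)      # 0.5 events per window
fast = events_per_window(100.0)     # 2.5 events per window
```

Under these assumptions a 25 ms window places the boundary at 40 Hz: a 20 Hz train delivers half an event per window on average and would be followed event by event, whereas a 100 Hz train delivers 2.5 events per window and would be represented by firing rate.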
People with normal hearing can perceive an increase in pitch with either an increase in presentation rate of short acoustic clicks (rate pitch) or an increase in frequency (place pitch). The relative importance of rate and place pitch is nonetheless still debated because the rate of mechanical stimulation of the basilar membrane is strongly correlated with position. For instance, low presentation rates produce vibrations in the apical basilar membrane where low frequencies are encoded and high rates produce vibrations in the basal end where high frequencies are encoded. In this way, the upper limit of temporal pitch in people with normal hearing is per se the upper hearing range of 20 kHz. The two stimulus variables, rate and place, can, however, be controlled independently in a CI: different rates can be applied to different electrode positions along the basilar membrane. CI users can only perceive an increase in pitch with increasing stimulus rate up to 300 Hz, also known as the “300 Hz limit” (McKay et al., 1994). The neural basis for this ceiling effect has been unknown for decades. Interestingly, Johnson et al. (2017) observed that NS neuron firing increased with increasing CI stimulus rate and plateaued at ∼257 Hz. Combined with the fact that marmosets have similar pitch perception properties and organization of auditory cortex as humans (de la Mothe et al., 2006; Song et al., 2016), this led the authors to propose that NS population activity is a strong neural candidate to encode rate–pitch perception in CI users and to determine the long unexplained 300 Hz limit (Johnson et al., 2017). Nevertheless, recent literature suggests that there is no 300 Hz limit in CI users; instead the upper limit of pulse rate discrimination is dependent on the CI stimulation parameters used (Venter and Hanekom, 2014).
For instance, stimulating auditory nerve fibers near the tip of the cochlea improved phase locking in cats (Middlebrooks and Snyder, 2010), and Macherey et al. (2011) showed that the upper limit of rate pitch in human CI users could be extended somewhat with asymmetric pulses compared with standard symmetric pulse shapes. Also, stimulating multiple electrodes at the same time yields better rate discrimination than stimulating a single electrode (Venter and Hanekom, 2014). Thus, if NS neurons are the neural basis of rate coding in the auditory cortex, any stimulation strategy that raises the 300 Hz limit should be mirrored by an elevated plateau in NS neuron firing. Consequently, NS neurons could be an attractive target for future repetition rate-based stimulus designs that might lead to better pitch perception in human CI users.
Until the two above-mentioned issues have been investigated, concluding that auditory cortex encodes temporal information from acoustic and cochlear implant stimulation in a similar way seems premature. Nonetheless, Johnson et al. (2017) provide valuable insight into how the brain processes temporal information from a CI and acoustic sound at the single-neuron level. Indeed, a dialogue between animal research and human psychophysics is needed to optimize the future design of CI processing strategies and eventually to improve the perception of speech and music in human CI users.
Footnotes
Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/preparing-manuscript#journalclub.
We thank Eva Meier Carlsen and Jean-Marc Edeline for critically reading the manuscript and providing helpful comments.
Correspondence should be addressed to Charlotte Amalie Navntoft, Department of Biomedicine, Basel University, Klingelbergstrasse 50–70, Room 7014, 4056 Basel, Switzerland. charlotte.navntoft@unibas.ch