Abstract
The functional organization of human auditory cortex can be probed by characterizing responses to various classes of sound at different anatomical locations. Along with histological studies this approach has revealed a primary field in posteromedial Heschl's gyrus (HG) with pronounced induced high-frequency (70–150 Hz) activity and short-latency responses that phase-lock to rapid transient sounds. Low-frequency neural oscillations are also relevant to stimulus processing and information flow, however, their distribution within auditory cortex has not been established. Alpha activity (7–14 Hz) in particular has been associated with processes that may differentially engage earlier versus later levels of the cortical hierarchy, including functional inhibition and the communication of sensory predictions. These theories derive largely from the study of occipitoparietal sources readily detectable in scalp electroencephalography. To characterize the anatomical basis and functional significance of less accessible temporal-lobe alpha activity we analyzed responses to sentences in seven human adults (4 female) with epilepsy who had been implanted with electrodes in superior temporal cortex. In contrast to primary cortex in posteromedial HG, a non-primary field in anterolateral HG was characterized by high spontaneous alpha activity that was strongly suppressed during auditory stimulation. Alpha-power suppression decreased with distance from anterolateral HG throughout superior temporal cortex, and was more pronounced for clear compared to degraded speech. This suppression could not be accounted for solely by a change in the slope of the power spectrum. The differential manifestation and stimulus-sensitivity of alpha oscillations across auditory fields should be accounted for in theories of their generation and function.
SIGNIFICANCE STATEMENT To understand how auditory cortex is organized in support of perception, we recorded from patients implanted with electrodes for clinical reasons. This allowed measurement of activity in brain regions at different levels of sensory processing. Oscillations in the alpha range (7–14 Hz) have been associated with functions including sensory prediction and inhibition of regions handling irrelevant information, but their distribution within auditory cortex is not known. A key finding was that these oscillations dominated in one particular non-primary field, anterolateral Heschl's gyrus, and were suppressed when subjects listened to sentences. These results build on our knowledge of the functional organization of auditory cortex and provide anatomical constraints on theories of the generation and function of alpha oscillations.
Introduction
Human primary auditory cortex occupies the posteromedial portion of Heschl's gyrus (HG) and can be distinguished from neighboring non-primary fields based on architectonic and electrophysiological features (Hackett et al., 2001; Clarke and Morosan, 2012; Nourski and Howard, 2015). Responses to sound in primary auditory cortex phase-lock to rates beyond 100 Hz and include characteristic short-latency evoked components (Liégeois-Chauvel et al., 1991; Howard et al., 2000; Brugge et al., 2008, 2009). They also contain pronounced induced power in the high gamma range (Steinschneider et al., 2008; Brugge et al., 2009; 70–150 Hz), which is considered a proxy for multiunit spiking activity (Crone et al., 2001; Mukamel et al., 2005; Manning et al., 2009).
The anatomical distribution and stimulus-sensitivity of ongoing low-frequency oscillations in auditory cortex have been less well characterized, but there are several reasons to expect the strength of alpha-band (broadly defined as 7–14 Hz) activity in particular also to vary across primary and non-primary auditory cortex. First, cortical high-gamma and alpha activity often exhibit an antagonistic relationship (Crone et al., 1998; Mukamel et al., 2005; Ramot et al., 2012). For example, alpha power in visual cortex is suppressed as high gamma activity increases upon afferent stimulation (Hoogenboom et al., 2006; Bauer et al., 2014). Second, primary and non-primary auditory fields have distinct architectonic and connectivity profiles (Clarke and Morosan, 2012), which may give rise to intrinsic dynamics at different timescales (Başar and Güntekin, 2008; Honey et al., 2012; Murray et al., 2014). Third, alpha oscillations have been associated with a range of neural operations and processes that may differentially engage lower versus higher levels of the cortical hierarchy (Clayton et al., 2018). These include functional inhibition (Klimesch et al., 2007; Jensen and Mazaheri, 2010), inter-areal synchronization (Palva et al., 2010), and the communication of sensory predictions (Bauer et al., 2014; Sedley et al., 2016; Auksztulewicz et al., 2017; Chao et al., 2018).
Most of what we know of human alpha oscillations derives from study of the dominant occipital and parietal sources detectable in scalp electroencephalography (EEG), particularly when the eyes are closed or during manipulations of visuospatial attention (Berger, 1931; Clayton et al., 2018). However, ongoing oscillations at the lower end of the alpha frequency range (7–10 Hz) have also been recorded from superior temporal cortex. These show power decreases during auditory stimulation (Niedermeyer, 1990; Tiihonen et al., 1991; Lehtelä et al., 1997; Gomez-Ramirez et al., 2011; Weisz et al., 2011; Fontolan et al., 2014; A. Keitel and Gross, 2016), the extent of which is sensitive to the nature of the stimulus, for example with clear speech eliciting a larger power suppression compared with noisy speech (Obleser and Weisz, 2012; de Pesters et al., 2016). There are indications that the magnitude of ongoing alpha oscillations (Frauscher et al., 2018) and of their suppression during stimulation (Fontolan et al., 2014) is lower on HG than elsewhere in the temporal lobe. However, recording sites in the relevant studies were not localized to primary versus non-primary auditory cortex based on known electrophysiological response properties.
To more precisely characterize the spatial distribution and time course of low-frequency oscillations, including alpha, in human auditory cortex, we recorded from eight hemispheres in seven adult patients implanted with electrodes along the length of HG and elsewhere on the superior temporal plane and lateral superior temporal gyrus, during clinical monitoring for epilepsy. This coverage allowed analysis of local field potentials in both primary and non-primary auditory fields, defined both anatomically and based on responses to click trains, while subjects listened to clear and degraded speech. Our objective was not to directly test particular accounts of alpha generation or function, but rather to provide anatomical specificity that may constrain the development of these theories, particularly as they relate to auditory processing.
Materials and Methods
Subjects.
Subjects were seven individuals (4 females; median age: 33 years; age range 22–56 years) undergoing intracranial monitoring for diagnosis and treatment of medically intractable epilepsy. Recordings were made in an electromagnetically shielded hospital room at the Epilepsy Monitoring Unit at the University of Iowa Hospitals and Clinics. Three individuals had electrode placement in the left hemisphere only (Subjects L357, L403, L442), three in the right hemisphere only (Subjects R369, R399, R429), and one bilaterally (Subject B335). Further details of electrode placement are provided in the “Data recording” section and in Figure 1. All subjects were right-handed with left-hemisphere language dominance as determined by preimplantation Wada testing. Three subjects (B335, R399, L442) had normal hearing (pure tone thresholds <20 dB hearing level for frequencies between 250 and 4000 Hz) and four (L357, R369, L403, R429) had mild hearing loss at isolated frequencies (there was no systematic difference in results between these two subgroups). Vision was self-reported as normal or corrected to normal; participants who required glasses wore them during the task. All participants were native speakers of English. Subject details, including demographic data, seizure focus, details of hearing loss, and the number of contacts in each studied field, are provided in Table 1.
Research protocols were approved by the University of Iowa Institutional Review Board, and subjects signed informed consent documents before any recordings. Research did not interfere with acquisition of clinical data, and subjects could withdraw from research at any time without consequence for their clinical monitoring or treatment. Subjects initially remained on their antiepileptic medications but these were typically decreased in dosage during the monitoring period at the direction of neurologists until sufficient seizure activity had been recorded for localization, at which point antiepileptic medications were resumed. No research occurred within 3 h following seizure activity. Electrode contacts at seizure foci were excluded from all analyses.
Stimuli.
Stimuli for determining primary versus non-primary sites were 160 ms long click trains, presented at rates of 25, 50, 100, 125, 150, and 200 Hz. 50 click trains were presented at each rate, in random order and with intervals between onsets of successive click trains drawn from a normal distribution with mean 2000 ms and SD 10 ms. Stimuli for the main experiment were clear and noise-vocoded versions of English sentences previously used by Wild et al. (2012). They were recorded by a female native speaker of North American English in a soundproof booth using an AKG C1000S microphone with 16-bit sampling at 44.1 kHz using an RME Fireface 400 audio interface. Three-band noise-vocoding was performed as described by Shannon et al. (1995) using a custom-designed vocoder implemented in MATLAB (MathWorks; RRID:SCR_001622). In detail, items were filtered into three contiguous frequency bands (50–558, 558–2264, and 2264–8000 Hz, selected to have equal spacing along the basilar membrane), using FIR Hann bandpass filters with an 801-sample-window length. The amplitude envelope in each frequency band was extracted using full-wave rectification followed by low-pass filtering at 30 Hz with a fourth-order Butterworth filter. These envelopes were applied to bandpass filtered noise in the same frequency ranges, the results of which processing were summed to produce the noise-vocoded sentence. Clear speech remained unprocessed, containing all frequencies up to 22.05 kHz. Finally, the entire set of clear and noise-vocoded sentences was normalized for root-mean-square intensity.
Procedure.
Subjects took part while sitting upright in their hospital bed. Sounds were presented diotically at a comfortable listening level via insert earphones (ER4B, Etymotic Research) integrated into custom-fit earmolds. For the presentation of click trains, subjects relaxed and performed no task. For the main experiment, subjects fixated on the center of a screen (ViewSonic VX922 or Dell 1707FPc) positioned ∼60 cm in front of them and made responses on a computer keyboard. Each trial consisted of an initial auditory sentence presentation (either clear or degraded; duration 1223–4703 ms), a delay (1250–1500 ms), a written sentence (2500–2750 ms), a further delay (1250–1500 ms), a repeat presentation of the auditory sentence, and a further delay (2300–2800 ms). During the delay periods a fixation cross was displayed in the center of the screen. The written sentence either matched, or was completely different to, the content of the spoken sentence. To facilitate attention to the auditory stimuli, subjects were told that they would be asked about the content of the spoken sentences at the end of the experiment. To encourage attention to the text, subjects were asked to report occasional capital letters that occurred in 9% of trials (excluded from analysis) using the hand ipsilateral to the hemisphere with the most electrode contacts. Stimulus presentation and response collection was controlled by Presentation (Neurobehavioral Systems; RRID:SCR_002521) running on a Windows PC. The speech experiment lasted between 30 and 45 min, depending on the number of trials completed (range 120–136) and whether the subject opted to take a break. The manipulation of the visual cue was included to answer a separate research question concerning the neural basis of perceptual pop-out of degraded speech; related results including performance in the behavioral tasks will be reported elsewhere.
Data recording.
Electrophysiological activity was recorded using depth and subdural electrodes (Ad-Tech Medical). Subdural grids with coverage including superior temporal gyrus consisted of platinum-iridium discs (2.3 mm diameter, 5–10 mm inter-contact distance) embedded in a silicon membrane. Depth arrays of 8–12 recording contacts spaced at distances of 5 mm were implanted stereotactically, targeting HG. In some patients, additional arrays targeting the insula provided coverage of other sites in the superior temporal plane (Nagahama et al., 2018). A subgaleal contact served as a reference. Electrode placement was determined solely on the basis of clinical requirements, as determined by the neurosurgery and epileptology team (Nourski and Howard, 2015). Data acquisition was via a TDT RZ2 real-time processor (Tucker-Davis Technologies) for subjects B335 and L357, and via a NeuraLynx Atlas System (NeuraLynx) for the remaining subjects. Data were amplified, filtered (TDT: 0.7–800 Hz bandpass, 12 dB/octave roll-off; NeuraLynx: 0.1–500 Hz, 5 dB/octave roll-off, with the exception of subject R369 for whom filtering was at 0.8–800 Hz), and digitized with a sampling rate of 2034.5 Hz (TDT) or 2000 Hz (NeuraLynx).
Anatomical reconstruction of implanted electrode locations, mapping to a standardized coordinate space, and assignment to regions of interest were performed using FreeSurfer image analysis suite v5.3 (Martinos Center for Biomedical Imaging; RRID:SCR_001847) and custom software, as described previously (Nourski et al., 2014). In brief, whole-brain high-resolution T1-weighted structural magnetic resonance imaging (MRI) scans (resolution and slice thickness ≤1.0 mm) were obtained from each subject before electrode implantation. After implantation, subjects underwent MRI and thin-slice volumetric computerized tomography (CT; resolution and slice thickness ≤1.0 mm) scanning. Electrode locations were initially extracted from post-implantation MRI and CT scans, then projected onto preoperative MRI scans using nonlinear three-dimensional thin-plate spline morphing, aided by intraoperative photographs. Where required, standard Montreal Neurological Institute coordinates were found for each contact using linear coregistration to the ICBM152 atlas, as implemented in FMRIB Software Library v5.0 (FMRIB Analysis Group; RRID:SCR_002823).
Data analysis.
Off-line data analysis was performed in MATLAB v2012b (MathWorks; RRID:SCR_001622) using the Fieldtrip Toolbox v20131231 (http://www.fieldtriptoolbox.org/; RRID:SCR_004849; Oostenveld et al., 2011) and custom MATLAB scripts. Power line noise was removed using a filter based on the demodulated band transform (Kovach and Gander, 2016). Data were then downsampled to 1000 Hz, and recordings divided into epochs from −1500 to 6300 ms relative to sentence onset.
Responses to click trains were investigated using intertrial phase coherence (ITPC; Lachaux et al., 1999), which indexes the strength with which neural activity synchronizes (phase locks) with temporally regular patterns in acoustic stimulation (Herrmann and Johnsrude, 2018; C. Keitel et al., 2019). To this end, a fast Fourier transform (including a Hann window taper and zero-padding) was calculated for frequencies ranging from 20 to 210 Hz using the data in the 0–0.2 s time window (i.e., the click-train duration). The resulting complex numbers were normalized by dividing each by its magnitude. ITPC was then calculated as the absolute value of the mean normalized complex number across trials. ITPC can take on values between 0 (no coherence) and 1 (maximum coherence). ITPC was calculated separately for each contact and repetition rate (50, 100, 125, 150, 200 Hz). We also calculated ITPC based on surrogate data to compare the empirically observed ITPC values against ITPC chance levels (Stam, 2005). In detail, time series data (0–0.2 s after stimulus onset) were converted to the frequency domain using an FFT; the phase of each frequency component was randomized, and the data were subsequently converted back to the time domain. ITPC was then calculated for the surrogate time series. Calculation of surrogate data and ITPC was repeated 100 times, and ITPC values were averaged subsequently, leading to a chance-level ITPC.
For the main experiment, time-frequency analysis of oscillatory activity was conducted separately for low-frequency (2–30 Hz) and high-frequency (40–180 Hz) activity using Morlet wavelets (Tallon-Baudry et al., 1996; Tallon-Baudry and Bertrand, 1999). In detail, for each sentence and contact, time-frequency representations were calculated for the −0.6 to 1.2 s time window relative to stimulus onset in 10 ms steps. We limited the endpoint of our analysis time window to 1.2 s to ensure that our analysis would be related to ongoing activity during stimulus presentation (1.223 s was the duration of the shortest sentence). For frequencies between 2 and 30 Hz (calculated in steps of 0.2 Hz), wavelet size linearly increased from 3 to 12 cycles as a function of frequency. For frequencies between 40 and 180 Hz (calculated in steps of 1 Hz), a wavelet size of 12 cycles was used uniformly. Power was calculated as the squared magnitude of the complex wavelet coefficients and averaged across trials. Time-frequency power was baseline-corrected by dividing the power at each time point by the mean power in the −0.6 to −0.1 s prestimulus time window, taking the logarithm (base 10), and multiplying the result by 10 (separately for each frequency). The result is power in decibel units, reflecting the signal change from baseline.
Functional localization.
Recording sites were identified as belonging to a primary region of interest in posteromedial HG (pmHG) if they showed significant phase-locked responses to 100 Hz click trains, and if averaged click-evoked potentials included short-latency (<20 ms) components (Brugge et al., 2009; Nourski et al., 2016). Other sites along the gyrus that did not demonstrate these properties were deemed to be in non-primary cortex and assigned to the anterolateral HG (alHG) region of interest.
Experimental design and statistical analysis.
Nonparametric Wilcoxon signed rank tests were calculated using the signrank function in MATLAB. False discovery rate was used to correct for multiple comparisons (Benjamini and Hochberg, 1995; Genovese et al., 2002), if not indicated otherwise. Effect sizes are reported as re (requivalent; Rosenthal and Rubin, 2003), which is equivalent to a Pearson product-moment correlation for two continuous variables, to a point-biserial correlation for one continuous and one dichotomous variable, and to the square root of partial η2 for ANOVAs. Data are available on request. The experiment was not preregistered.
Software accessibility
Custom code is available on request.
Results
Functional segregation into primary and non-primary auditory cortex
A posteromedial-anterolateral boundary along HG between primary and non-primary auditory cortex was identified for each hemisphere (Fig. 1A), based on responses to click trains. As per our definition, phase-locking at 100–150 Hz click stimulation rates was significantly greater at posteromedial than at anterolateral contacts (Fig. 1B,C). It is worth re-emphasizing previous work showing that neural activity in cortical as well as subcortical brain regions synchronizes with such high stimulation rates (Nourski et al., 2013, 2016; Coffey et al., 2016; Holmes and Herrmann, 2017).
Alpha-power suppression and high gamma-power enhancement dominate in anterolateral and posteromedial HG, respectively
Figure 2 shows activity from 1 s before sentence onset until 1 s after sentence onset for two exemplary trials at three pairs of contacts, each consisting of one pmHG and one alHG contact from a single subject and hemisphere. A striking feature is the prominent low-frequency prestimulus activity in alHG that is reduced following sentence onset. This observation is supported by a group analysis. Figure 3A displays time-frequency power for the low-frequency range (2–30 Hz) for pmHG and alHG, averaged across all recording sites. Motivated by previous research on alpha oscillatory activity in the temporal lobe, we focused on the 7–10 Hz frequency band (Tiihonen et al., 1991; Lehtelä et al., 1997); note however that low-frequency suppression was not limited to this band (Fig. 3A).
Alpha-power was significantly lower in alHG compared with pmHG (p = 0.008, re = 0.812; Fig. 3B). At the individual hemisphere level, this effect was significant in 7 of 8 cases (p ≤ 0.05, FDR-corrected; permutation procedure, where the pmHG and alHG labels were shuffled for single trials, with 1000 repetitions). The time course for the alpha frequency band was extracted and is shown in Figure 3C. Alpha-power suppression after sentence onset (relative to the prestimulus period) was significantly stronger in alHG compared with pmHG for the duration of the 0.5–1.2 s analysis window (p ≤ 0.05, FDR-corrected; Fig. 3C, black solid line). Note that although our analysis focused on 7–10 Hz, motivated by previous work and by the peak in the spectrum (Fig. 3D), the suppression of power in alHG relative to pmHG was also present for lower frequencies. To investigate whether the suppression of alpha power persisted for the duration of the sentences, which were of variable length, we also analyzed the data time-locked to sentence offset (baseline-corrected using the 0.1–0.6 s time window post-sentence offset). Alpha power (7–10 Hz; −1.2 to −0.5 s) remained significantly lower in alHG than in pmHG (p = 0.023, re = 0.737; not plotted) and was below the post-sentence baseline.
The pattern of activity across primary and non-primary cortex in the high gamma range had a different profile. Figure 3E–H shows that the strength of high gamma responses (70–150 Hz) to sentences was greater in pmHG compared with alHG (p = 0.023, re = 0.737) from shortly after stimulus onset, consistent with previous findings for click trains and single syllables (Brugge et al., 2009; Steinschneider et al., 2014).
Alpha-power suppression throughout superior temporal cortex decreases with distance from anterolateral HG
We wanted to establish whether alpha-power suppression decreases with distance from alHG throughout the superior temporal plane and superior temporal gyrus, which would suggest that the main local source of alpha oscillations is located in alHG. Figure 4A shows alpha power (0.5–1.2 s; 7–10 Hz) relative to the prestimulus time window for all superior-temporal-cortex contacts as a function of spatial (Euclidean) distance to the mean coordinate of each subject's alHG contacts. For each hemisphere, we fitted a linear function to the alpha-power values as a function of spatial distance from this point and tested the estimated linear coefficient against zero. Alpha-power suppression decreased as spatial distance from alHG increased (p = 0.023, re = 0.737), suggesting that the main source of auditory alpha-power suppression lies in alHG.
To further visualize activity centers of alpha-power suppression and compare them to centers of high gamma power enhancement, alpha- and high gamma-power for each contact were projected onto a template brain (and left-hemisphere contacts were mapped onto the right hemisphere; smoothing 6 mm FWHM; Fig. 4B). High gamma-power enhancement (relative to baseline) was most prominent in posteromedial auditory cortex, but also present in the posterior part of the lateral surface of the temporal cortex. Alpha-power suppression (relative to baseline) was strongest in more anterior areas.
Alpha-power suppression in anterolateral HG is greater for clear than for noise-vocoded speech
Previous EEG work suggests that alpha-power suppression is stronger for clear compared with noise-vocoded speech (Obleser and Weisz, 2012). Estimated sources of this effect include extensive parieto-occipital and anterior temporal regions, but the resolution of this localization using scalp recordings is limited. Figure 5 shows that alpha-power suppression in the present data was larger for clear speech compared with noise-vocoded speech in alHG (p = 0.039, re = 0.692), but not in pmHG (p = 0.844, re = 0.077), although the stimulus by region interaction was not significant (p = 0.148, re = 0.523). This indicates that alpha suppression in at least one auditory cortical field is dependent on the spectral quality of the stimulus. At the individual hemisphere level, larger alpha-power suppression for clear speech in pmHG was not significant in any cases and in alHG was significant in 3 of 8 cases (p ≤ 0.05, FDR-corrected; permutation procedure, where the speech and noise labels were shuffled for single trials, with 1000 repetitions).
Prestimulus alpha power is greater in anterolateral than in posteromedial HG
Next, we examined the degree to which prestimulus power contributes to power differences between pmHG and alHG observed after stimulus onset. Power in the alpha-frequency band (7–10 Hz) was larger in alHG than in pmHG during the prestimulus time window (−0.6 to −0.1 s; p = 0.008, re = 0.812), whereas there was no difference in alpha power between alHG and pmHG for the poststimulus-onset time window (0.5–1.2 s; p = 0.742, re = 0.128; Fig. 6A). The baseline-corrected results presented in Figure 3B demonstrate that the interaction between time period and region was significant.
High gamma power (70–150 Hz) was larger in pmHG compared with alHG in both the prestimulus (−0.6 to −0.1 s; p = 0.039, re = 0.692) and in the poststimulus-onset (0–1.2 s; p = 0.039, re = 0.692) intervals (Fig. 6B). The baseline-corrected results presented in Figure 3F show that the time period by region interaction was also significant for high gamma power, with the difference in high gamma power between the two regions increasing after stimulus onset.
Alpha-power suppression in anterolateral HG reflects changes in both spectral slope and narrowband alpha power
Finally, we wanted to establish whether the suppressed low-frequency activity in alHG reflected a reduction in the strength of a narrowband oscillation that is present before stimulus onset, or could be better characterized as a change in the exponent χ (“slope”) of the scale-free 1/fχ component of the power spectrum. The latter may reflect properties of neural circuits distinct from those that generate narrowband oscillations (He, 2014; Gao, 2016; Podvalny et al., 2015; Voytek et al., 2015; Becker et al., 2018). The analyses described so far, in which power at each frequency was normalized by its value during a prestimulus baseline, cannot differentiate between these two contributions to neural power spectra.
To address this issue, for each hemisphere and region (pmHG, alHG; power averaged across contacts), we separately estimated the linear slope of the spectrum (on a log-log scale) and the residual narrowband oscillatory power after whitening by removal of the 1/fχ component (i.e., the slope). The slope of the spectrum was shallower during sound presentation compared with the prestimulus period in alHG (p = 0.039, re = 0.692), but not in pmHG (p = 0.742, re = 0.128; interaction: p = 0.008, re = 0.812; Fig. 7A), accounting for some proportion of the low-frequency power changes described earlier. Critically, an additional narrowband oscillatory peak at ∼8 Hz was present in the whitened power spectrum (i.e., after slope removal) in both regions during the prestimulus period (Fig. 7B). Power in the alpha (7–10 Hz) frequency band was larger in the prestimulus-onset time window (−0.6 to −0.1 s) compared with power during sound presentation (0.5–1.2 s) in pmHG (p = 0.023, re = 0.768) and alHG (p = 0.016, re = 0.768), and this effect was larger in alHG compared with pmHG (interaction: p = 0.023, re = 0.737; Fig. 7B).
Discussion
Intracranial recordings revealed distinct profiles of spectral power across anatomically and functionally defined auditory cortical fields, both before, and during, auditory stimulation. Stimulus-related activity in the high gamma range was greater in pmHG (a primary field) than in alHG (a non-primary field; Brugge et al., 2009; Steinschneider et al., 2014). Lower frequency oscillations also differed across these areas, most markedly in the 7–10 Hz (“alpha”) range. In the absence of auditory stimulation (i.e., before stimulus onset), alpha power was stronger in alHG than in pmHG; this difference began to diminish following the onset of speech as alHG alpha power decreased. The suppression persisted until auditory stimulation ended, and could not be accounted for by a change in the exponent χ (“slope”) of the 1/fχ component of the spectrum alone (Gao, 2016; Podvalny et al., 2015).
It has been proposed that one function of alpha oscillations throughout the brain is to reduce cortical excitability, limiting spontaneous or task-irrelevant processing (Klimesch et al., 2007; Jensen and Mazaheri, 2010). One might expect such a diminution in activity to manifest as reduced high gamma power, a proxy for asynchronous neural spiking (Crone et al., 2001; Mukamel et al., 2005; Manning et al., 2009). The anatomical dissociation we observed between alpha-power suppression (more anterolateral) and high gamma-power increases (more posteromedial) implies that any such antagonistic relationship does not hold as a fixed ratio across all sites. One interpretation is that primary cortex is ready to process sensory input somewhat indiscriminately, reflected in low (but non-zero) spontaneous alpha activity, whereas non-primary areas are selectively engaged through suppression of ongoing alpha oscillations following the onset of stimuli with particular acoustic properties (de Pesters et al., 2016) or of greatest task-relevance (Gomez-Ramirez et al., 2011; Lakatos et al., 2016; Wöstmann et al., 2017). We observed such a dependency: alpha-power suppression for clear speech began earlier and was more pronounced than for spectrally degraded speech. Here we have shown that this effect, previously reported in scalp EEG (Obleser and Weisz, 2012), holds in non-primary auditory cortex. The extent to which the difference in alpha-power suppression arises directly from the acoustic properties of the stimuli as opposed to their intelligibility or the level of attentional resources deployed cannot be determined from the current data. Independent manipulations of these factors in future studies would be valuable in ascertaining the functional significance of local alpha power changes.
Alpha oscillations have also been linked with predictive coding accounts of brain function (Friston and Kiebel, 2009). Activity in low-frequency bands is thought to carry predictions of the causes of sensory data from higher to lower hierarchical levels (von Stein et al., 2000; Bastos et al., 2012; Fontolan et al., 2014; Chao et al., 2018), or to reflect the precision of such predictions (Bauer et al., 2014; Sedley et al., 2016; Auksztulewicz et al., 2017). The finding that alpha oscillations propagate from non-primary to primary visual cortex in monkeys (van Kerkoerle et al., 2014) and their dominance in infragranular layers containing dense descending projections (Maier et al., 2010; Buffalo et al., 2011; but see Haegens et al., 2015; Bastos et al., 2018; Bonaiuto et al., 2018; Halgren et al., 2018) are consistent with a feedback role. However, to adequately describe electrophysiological activity during auditory processing, predictive coding models will need to account for the differences we report between auditory fields both in baseline alpha power and in its stimulus-related suppression. Simultaneous laminar recordings from core and belt areas in nonhuman primates would be beneficial in isolating input and output layers at different hierarchical levels.
Linking the observed spectral signatures of primary and non-primary cortex to underlying cytoarchitectural, myeloarchitectural, and chemoarchitectural properties is complicated by the lack of a single agreed parcellation of the human superior temporal plane, and the variability of HG morphology across individuals (von Economo and Horn, 1930). Our pmHG sites probably belong to core regions termed Te1.1/Te1.0 (Morosan et al., 2001) or AI (Rivier and Clarke, 1997; Wallace et al., 2002) and the alHG sites to Te1.2 (Morosan et al., 2001) or ALA (Wallace et al., 2002). Regardless of nomenclature, the former likely had a more developed granular layer with denser myelinated input from thalamus, and the latter more extensive long-range connections to non-sensory cortices (Hackett et al., 2001, 2014; Munoz-Lopez et al., 2010). These properties may give rise to preferred timescales of activity corresponding to different spectral profiles (Honey et al., 2012; Murray et al., 2014). In terms of chemoarchitecture, the distribution of receptors and enzymes underlying cholinergic transmission in particular differs across the two fields (Hutsler and Gazzaniga, 1996; Zilles et al., 2002). This is relevant as the cholinergic system influences cellular excitability (Cox et al., 1994; Hsieh et al., 2000) as well as the stimulus coding capacity of a neural population (Minces et al., 2017; Schmitz and Duncan, 2018), and disrupting it pharmacologically can reduce alpha power (Osipova et al., 2003; Bauer et al., 2012; Eckart et al., 2016).
Alpha-power suppression was observed not only in HG. Proximal contacts on the superior temporal plane and gyrus showed similar activity patterns, with suppression decreasing with distance from the center of alHG. Whether these measurements reflect a single volume-conducted source of alpha activity or separate oscillators has yet to be established. A related observation is that the spectral profiles in posterolateral superior temporal gyrus and pmHG are more similar to each other than are profiles between the two subdivisions of HG. This is consistent with the identification of a posterolateral superior temporal auditory area with similar functional properties to and direct connections with core auditory cortex (Howard et al., 2000; Brugge et al., 2003; Nourski et al., 2013, 2014).
The dominance of posterior alpha sources in scalp recordings hinders detection of distinct alpha activity from the temporal lobe. Furthermore, alpha-power changes following presentation of brief auditory stimuli, such as clicks and syllables, are likely dominated by the broadband low-frequency component of the evoked response (also visible in Fig. 3). Such stimuli may be too short to elicit the sustained suppression we observed, which reached a maximum ∼500 ms after stimulus onset and continued for the duration of the sentence. Nonetheless, magnetoencephalography, which is more selectively sensitive to tangentially-oriented dipoles that dominate on the superior temporal plane than EEG (Hämäläinen et al., 1993), has revealed likely temporal-lobe alpha sources without establishing their precise location (Tiihonen et al., 1991; Lehtelä et al., 1997; Weisz et al., 2011; Leske et al., 2015; Magazzini et al., 2016). Depth electrode recordings from HG in humans are relatively rare, and analysis of such data has largely focused on the evoked LFP or on high gamma activity (Nourski and Howard, 2015). Relevant exceptions include a multicenter study of resting state oscillatory activity (Frauscher et al., 2018). In that study, alpha band peaks were present in activity from superior temporal gyrus but not HG, nor elsewhere on the posterior superior temporal plane. These findings are somewhat consistent with our prestimulus results but included little if any data from alHG. Another study, in which pmHG was sparsely sampled, did not consistently reveal changes in alpha activity in response to sound, although broadband low-frequency activity in one patient was suppressed in auditory association cortex on the lateral surface (Fontolan et al., 2014). Finally, Podvalny et al. (2015) noted stimulus-related alpha suppression beyond a 1/fχ slope change in intracranially-implanted auditory areas, but did not report anatomical details. One contribution of the current report is to functionally identify sites as primary versus non-primary auditory cortex in all subjects, based on known neurophysiological response properties.
For our main analyses we compared power prestimulus to that >500 ms after sentence onset, excluding transient evoked activity. We did not consider oscillatory phase, known to be important for perception (Busch et al., 2009; van Rullen, 2016) and possibly underpinning communication between brain regions (Palva and Palva, 2011). Indeed, a reduction in alpha power such as we observed in non-primary cortex may occur alongside increased inter-areal alpha-band phase synchrony that has been argued to support cognitive function such as working memory (Freunberger et al., 2008, 2009; Doesburg et al., 2009). A full account linking alpha activity in different cortical fields to auditory processing will need to take both power and phase into consideration.
In summary, we report strong evidence for a source of alpha oscillations in alHG that is suppressed during prolonged auditory stimulation. This suppression is more pronounced when subjects listen to clear than to degraded speech, and drops off with distance from alHG throughout superior temporal cortex. The suppression cannot be explained solely by a change in the exponent χ (“slope”) of the 1/fχ component of the spectrum, and has a different anatomical distribution to induced high gamma activity, which is strongest in pmHG. Theories concerning the generation and function of alpha oscillations should account for their differential manifestation and stimulus-sensitivity in primary and non-primary auditory cortex.
Footnotes
This work was supported by NIH Grants R01-DC04290 and UL1RR024979, and by the Hoover Fund to M.A.H.; by CIHR Grant MOP 133450 and by the Canada First Research Excellence Fund (CFREF) award to BrainsCAN at Western University to A.J.B and I.S.J.; and by a BrainsCAN postdoctoral fellowship (CFREF) to B.H. We thank Haiming Chen for assistance with data collection, and Maria Chait, Tim Griffiths, and two anonymous reviewers for helpful comments on previous versions of the paper.
The authors declare no competing financial interests.
- Correspondence should be addressed to Alexander J. Billig at ajbillig{at}gmail.com
This is an open-access article distributed under the terms of the Creative Commons Attribution License Creative Commons Attribution 4.0 International, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.