Even simple tasks rely on information exchange between functionally distinct and often relatively distant neuronal ensembles. Considerable work indicates oscillatory synchronization through phase alignment is a major agent of inter-regional communication. In the brain, different oscillatory phases correspond to low- and high-excitability states. Optimally aligned phases (or high-excitability states) promote inter-regional communication. Studies have also shown that sensory stimulation can modulate or reset the phase of ongoing cortical oscillations. For example, auditory stimuli can reset the phase of oscillations in visual cortex, influencing processing of a simultaneous visual stimulus. Such cross-regional phase reset represents a candidate mechanism for aligning oscillatory phase for inter-regional communication. Here, we explored the role of local and inter-regional phase alignment in driving a well established behavioral correlate of multisensory integration: the redundant target effect (RTE), which refers to the fact that responses to multisensory inputs are substantially faster than to unisensory stimuli. In a speeded detection task, human epileptic patients (N = 3) responded to unisensory (auditory or visual) and multisensory (audiovisual) stimuli with a button press, while electrocorticography was recorded over auditory and motor regions. Visual stimulation significantly modulated auditory activity via phase reset in the delta and theta bands. During the period between stimulation and subsequent motor response, transient synchronization between auditory and motor regions was observed. Phase synchrony to multisensory inputs was faster than to unisensory stimulation. This sensorimotor phase alignment correlated with behavior such that stronger synchrony was associated with faster responses, linking the commonly observed RTE with phase alignment across a sensorimotor network.
Introduction
Oscillatory brain activity reflects fluctuations of neuronal excitability; that is, different phases of brain oscillations correspond to high- and low-excitability states (Steriade et al., 1996; Sherman and Guillery, 2002; Lakatos, 2005). Interactions between neuronal groups can be facilitated, or not, by whether their respective excitatory states (or phases) are optimally aligned for the exchange of information. Synchronized activity can lead to temporal windows of communication between task-relevant brain regions (Varela et al., 2001; Fries, 2005; Womelsdorf and Fries, 2006). But how do local oscillations become aligned to promote inter-regional communication? Such synchronization might occur through phase reset of ongoing oscillations (Lakatos et al., 2007; Mercier et al., 2013). Previous evidence has shown, for example, that auditory stimulation can reset the phase of ongoing oscillations in visual cortex, influencing the processing of simultaneous or subsequent visual stimulation (Lakatos et al., 2009; Fiebelkorn et al., 2011; Romei et al., 2012; Mercier et al., 2013). Such cross-sensory (or cross-regional) phase reset, which is a mechanism for early multisensory interactions (MSIs) in sensory cortex (Lakatos et al., 2007), might serve to synchronize oscillatory activity across a network of task-relevant brain regions. Greater synchronization might then facilitate the transmission of information and thus improve behavioral outcomes.
A number of studies have linked the phase of ongoing oscillations, particularly in the delta and theta frequency bands, with fluctuations in behavioral performance under conditions of unisensory stimulation (Lakatos et al., 2008; Busch et al., 2009; Stefanics et al., 2010; Drewes and VanRullen, 2011; Fiebelkorn et al., 2011, 2013a, b; Henry and Herrmann, 2012; Ng et al., 2012). Recent investigations from our laboratory have also indicated that the detection of a near-threshold target is influenced by cross-sensory phase reset (Fiebelkorn et al., 2011, 2013b). The present study is focused on the neural mechanisms underlying perhaps the most commonly observed behavioral correlate of multisensory integration: the speeding of response times under conditions of multisensory relative to unisensory stimulation, typically referred to as the multisensory redundant target effect (RTE; Schröger and Widmann, 1998; Molholm et al., 2002, 2006; Teder-Salejärvi et al., 2002; Murray et al., 2005; Talsma and Woldorff, 2005; Senkowski et al., 2006; Romei et al., 2007; Moran et al., 2008; Gingras et al., 2009). Whereas several models have been developed to explain this behavioral facilitation (Miller, 1982; Otto and Mamassian, 2012), none of these models have incorporated the potential role of neuro-oscillatory mechanisms in contributing to the multisensory RTE.
Here, electrocorticographic (ECoG) data were recorded from auditory and motor cortices while patients with epilepsy performed a simple detection task that included both unisensory (auditory and visual) and multisensory (audiovisual) stimuli. Our analysis tested (1) whether visual-alone stimuli influenced oscillatory activity in auditory cortex and (2) whether phase alignment during multisensory stimulation was stronger than for the sum of unisensory stimulation. Further, we investigated (3) whether multisensory stimulation leads to greater or faster local and inter-regional phase alignment and (4) whether inter-regional phase alignment was linked to response times (RTs). Our results show that stronger delta-band phase alignment in auditory cortex is linked to stronger phase alignment across a sensorimotor network and that stronger synchronization leads to faster RTs, with the fastest phase locking and the fastest RTs occurring under conditions of multisensory stimulation. These data provide compelling support for the notion that modulation of neuro-oscillatory activity, in the form of cross-sensory phase reset and sensorimotor phase coupling, plays a significant role in the multisensory facilitation of reaction times.
Materials and Methods
Data were collected from three patients implanted with subdural electrodes before undergoing presurgical evaluation for intractable epilepsy (P1: male, 46 years; P2: female, 18 years; P3: female, 38 years). Participants provided written informed consent, and the Institutional Review Boards of the Nathan Kline Institute, Weill Cornell Presbyterian Hospital, and The Albert Einstein College of Medicine approved the procedures. The conduct of this study was strictly in line with the principles outlined in the Declaration of Helsinki.
Electrode placement and localization.
Subdural electrode (stainless steel electrodes from AD-Tech Medical Instrument) placement and density were dictated solely by medical purposes. The precise location of each electrode was determined through nonlinear coregistration of preoperative structural MRI (sMRI), postoperative sMRI, and CT scans. The preoperative sMRI provided accurate anatomic information, the postoperative CT scan provided an undistorted view of electrode placements, and the postoperative sMRI (i.e., sMRI conducted while the electrodes were still implanted) allowed for an assessment of the entire coregistration process and the correction of brain deformation due to the presence of the electrodes. Coregistration procedures, normalization into MNI space, electrode localization, and image reconstruction were done through the BioImage suite software package (Lacadie et al., 2008) and results projected on the MNI-Colin27 brain (X. Papademetris, M. Jackowski, N. Rajeevan, H. Okuda, R. T. Constable, and L. H. Staib, BioImage Suite: an integrated medical image analysis suite, Section of Bioimaging Sciences, Department of Diagnostic Radiology, Yale School of Medicine).
Stimuli and task.
Auditory-alone, visual-alone, and audiovisual stimuli were presented equiprobably and in random order using Presentation software (Neurobehavioral Systems). The interstimulus interval was randomly distributed between 750 and 3000 ms. The auditory stimulus, a 1000 Hz tone with a duration of 60 ms (5 ms rise/fall times), was presented at a comfortable listening level that ranged between 60 and 70 dB, through Sennheiser HD600 headphones; the visual stimulus, a centered red disk subtending 3° on the horizontal meridian, was presented on a CRT (Dell Trinitron, 17”) monitor for 60 ms, at a viewing distance of 75 cm. Patients maintained central fixation and responded as quickly as possible whenever a stimulus was detected, regardless of stimulus type (auditory-alone, visual-alone, or audiovisual). All participants responded with a button press, using their right index finger (for previous application of this paradigm to probe multisensory processing, see Molholm et al., 2002, 2006; Senkowski et al., 2006, 2007; Brandwein et al., 2011, 2013; Mercier et al., 2013). Each block included 100 stimuli, and the patients completed between 12 and 20 blocks. To maintain focus and prevent fatigue, patients were encouraged to take frequent breaks. The experimenters monitored eye position.
Intracranial EEG recording and preprocessing.
Continuous intracranial EEG (iEEG) was recorded using BrainAmp amplifiers (Brain Products) and sampled at 1000 Hz (low/high cutoff = 0.1/250 Hz). A subdural, frontally placed electrode was used as the reference during the recordings.
Off-line, trials with a button response falling between 100 and 750 ms poststimulus onset were selected, and corresponding iEEG was epoched from −1500 to 1500 ms either time locked to stimulus onset or to the button press. These epochs (±500 ms padding) then underwent artifact rejection. An adaptive procedure, based on standardized z-values calculated across time independently for each channel and adjusted for each dataset (participant), was applied to detect artifacts. Manual scanning of the raw data was performed to evaluate the quality of this procedure [final average number of trials across participants used for this analysis was as follows: audiovisual (AV) 194 ± 12; visual-alone (V) 181 ± 7; auditory-alone (A) 148 ± 15]. Detrended epochs were further preprocessed to remove line noise (60/120 Hz) using a discrete Fourier transform and high-pass (0.1 Hz) and low-pass (125 Hz) filtered using a two-pass, fourth-order Butterworth filter. Baseline correction was conducted over the entire epoch.
Electrodes implanted subdurally are highly sensitive to local field potentials (LFPs) and much less sensitive to distant activity. To further improve the spatial resolution and avoid far-field diffusion, LFPs were used to estimate the spatial derivative of the voltage axis (Perrin et al., 1987; Butler et al., 2011; Gomez-Ramirez et al., 2011; Mercier et al., 2013). A composite local reference scheme was applied in which the composite was defined by the number of immediate electrode neighbors on the horizontal and/or vertical plane (see Eq. 1). This number varied from 1 to 4 on the basis of the reliability of the electrical signal (i.e., electrodes contaminated by electrical noise or epileptic activity were not included). For instance, a five-point formula was applied when there were four immediate neighbors (grids), whereas a four-point formula was used when there were three immediate neighbors. This approach was used to ensure maximum representation of the local signal, independent of the reference, and minimum contamination through diffusion of currents from more distant generators (i.e., volume conduction). In Equation 1, Vi,j (or Vk) denotes the recorded field potential at the ith row and jth column (or kth position) in the electrode grid (or strip).
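The composite local reference can be illustrated with a minimal sketch. Because Equation 1 is not reproduced in the text above, the weighting used here is an assumption: each contact is re-referenced by subtracting the mean of its valid immediate neighbors, which gives a five-point formula with four neighbors and a four-point formula with three. The function name `local_reference` and the use of `None` to mark excluded contacts are hypothetical choices made for illustration.

```python
from typing import List, Optional

# One voltage sample per contact; None marks a contact excluded for noise
# or epileptic activity (an illustrative convention, not from the paper).
Grid = List[List[Optional[float]]]


def local_reference(grid: Grid) -> Grid:
    """Subtract the mean of the valid immediate neighbors from each contact.

    Assumed form: V_ref[i][j] = V[i][j] - mean(valid horizontal/vertical neighbors).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out: Grid = [[None] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            if grid[i][j] is None:
                continue  # excluded contacts stay excluded
            neighbors = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                r, c = i + di, j + dj
                if 0 <= r < n_rows and 0 <= c < n_cols and grid[r][c] is not None:
                    neighbors.append(grid[r][c])
            if neighbors:  # between 1 and 4 neighbors contribute
                out[i][j] = grid[i][j] - sum(neighbors) / len(neighbors)
    return out
```

In practice the same operation would be applied to every time sample of every epoch; a strip of electrodes is simply a grid with a single row.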
To compute ERPs, all re-referenced nonrejected trials were averaged both time locked to stimulus onset for each stimulus condition (auditory-alone, visual-alone, and audiovisual) and time locked to the button press in response to each stimulus condition (following auditory-alone, visual-alone, and audiovisual stimuli). To verify whether sensory ERPs represented a statistically significant modulation from baseline, poststimulus amplitudes (from 0 to 300 ms) were compared with baseline amplitude values (from −100 to 0 ms). This was done using a random permutation test as described previously (Mercier et al., 2013). For the motor ERPs, the same method was applied with baseline defined as the entire period, since motor-related evoked activity starts before the button press due to preparatory activity and lasts for a few hundred milliseconds after the button press.
Based on the ERP analysis, we defined the contacts of interest (COIs); that is, for each participant, we selected the channels with the highest auditory or motor activity. For the auditory COI, the selection was based on the earliest and largest ERP modulation observed over auditory regions. For the motor COI, the observation of the strongest ERP phase reversal (with the first phase characteristic of a readiness potential) time locked to the response was used as a selection criterion.
To perform time-frequency decomposition, individual trials were convolved with complex Morlet wavelets, which had a width from three to seven cycles (f0/σf = 3 for 3–4 Hz, 5 for 5–6 Hz, and 7 for higher frequencies; Fiebelkorn et al., 2013b). The frequency range of these wavelets was 3–50 and 70–125 Hz (to circumvent the ambient 60 Hz artifact noise) increasing in 1 Hz steps, with convolution applied every 10 ms. Power and phase concentrations were computed based on the complex output of the wavelet transform (Tallon-Baudry et al., 1996; Roach and Mathalon, 2008; Oostenveld et al., 2011). To avoid any back-leaking from poststimulus activity into the prestimulus period, the baseline used for the time-frequency analysis was from −1000 to −500 ms.
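The wavelet step can be sketched as follows. This is an illustration under stated assumptions rather than the FieldTrip routine used for the analyses: the wavelet is a Gaussian-windowed complex exponential with temporal width σt = (f0/σf)/(2πf0), it is truncated at ±3σt, and the coefficient is normalized by the sum of the Gaussian envelope. The function names `wavelet_width` and `morlet_coefficient` are hypothetical.

```python
import cmath
import math
from typing import Sequence


def wavelet_width(f0: float) -> int:
    """Ratio f0/sigma_f as stated in the text: 3 (3-4 Hz), 5 (5-6 Hz), 7 above."""
    if f0 <= 4:
        return 3
    if f0 <= 6:
        return 5
    return 7


def morlet_coefficient(signal: Sequence[float], fs: float, f0: float,
                       center: int) -> complex:
    """Complex Morlet coefficient of `signal` at sample `center`, frequency f0.

    The returned value carries amplitude (abs) and phase (cmath.phase).
    Truncation at +/-3 sigma_t and the envelope normalization are assumptions.
    """
    sigma_t = wavelet_width(f0) / (2 * math.pi * f0)
    half = int(round(3 * sigma_t * fs))
    acc, norm = 0j, 0.0
    for k in range(-half, half + 1):
        idx = center + k
        if not 0 <= idx < len(signal):
            continue  # skip samples outside the epoch
        t = k / fs
        envelope = math.exp(-t * t / (2 * sigma_t ** 2))
        # Convolution with g(t) * exp(+2*pi*i*f0*t), evaluated at `center`
        acc += signal[idx] * envelope * cmath.exp(-2j * math.pi * f0 * t)
        norm += envelope
    return acc / norm
```

With a 1000 Hz sampling rate, evaluating the coefficient every 10 ms corresponds to stepping `center` by 10 samples.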
Analysis of phase concentration and power.
To evaluate the presence or absence of systematic increases in phase concentration across trials, the phase concentration index (PCI; introduced as phase-locking factor in Tallon-Baudry et al., 1996, and also referred to as intertrial coherence in Makeig et al., 2002 and Delorme and Makeig, 2004) was computed as follows. The complex result of the wavelet convolution for each time point and frequency within a given trial was normalized by its amplitude such that each trial contributed equally to the subsequent average (in terms of amplitude). This provided an indirect representation of the phase concentration across trials, with possible values ranging from 0 (no phase consistency across trials) to 1 (perfect phase alignment across trials). To test for significant PCIs relative to stimulus onset, we used the same random permutation test procedure as used to assess ERP statistical significance (see above). A unique distribution was generated for each frequency.
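As a minimal sketch of the PCI computation described above, assuming the input is the list of complex wavelet coefficients obtained across trials at one time point and one frequency (the function name is hypothetical):

```python
import cmath
import math
from typing import Sequence


def phase_concentration_index(coeffs: Sequence[complex]) -> float:
    """Length of the mean unit phase vector across trials (0 to 1)."""
    # Normalizing by amplitude makes every trial contribute equally
    unit_vectors = [c / abs(c) for c in coeffs if c != 0]
    return abs(sum(unit_vectors)) / len(unit_vectors)


# Same phase on every trial, different amplitudes: PCI close to 1
aligned = [cmath.rect(r, 0.3) for r in (0.5, 1.0, 4.0)]
# Phases spread evenly around the circle: PCI close to 0
spread = [cmath.rect(1.0, k * math.pi / 2) for k in range(4)]
print(phase_concentration_index(aligned), phase_concentration_index(spread))
```

The two toy inputs illustrate the bounds of the index: amplitude differences do not affect it, whereas phase dispersion across trials drives it toward zero.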
To assess evidence for phase resetting, we determined whether or not significant changes in phase concentration occurred in the absence of changes in power (Shah et al., 2004). Event-related spectral perturbations were visualized by computing spectral power relative to baseline (i.e., after subtracting the baseline average power, the power value was divided by the mean of the baseline values). The significance of increases or decreases in power from baseline was calculated using the same statistical procedure as for PCI and ERP analyses.
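The baseline normalization of power described above amounts to a relative change, (P − mean baseline)/mean baseline. A short sketch, assuming trial-averaged power values at one frequency and a baseline given as a slice of sample indices (both the function name and this input layout are illustrative assumptions):

```python
from typing import List, Sequence


def relative_power(power: Sequence[float], baseline: slice) -> List[float]:
    """Express power as a fractional change from the mean baseline power."""
    base = power[baseline]
    mean_base = sum(base) / len(base)
    # Subtract the baseline mean, then divide by it
    return [(p - mean_base) / mean_base for p in power]
```

A value of 0 indicates no change from baseline, and a value of 1 indicates a doubling of power. Under the phase-reset criterion used here, a significant increase in PCI accompanied by values that do not differ significantly from 0 is the profile of interest.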
For both the ERP and power analyses, poststimulus activity could be either positive or negative relative to baseline. Therefore a two-tailed threshold was used to determine statistical significance (p was considered significant if p ≤ 0.025 or p ≥ 0.975). For analysis of phase alignment, a one-tailed approach was used to determine statistical significance (p was considered significant if p ≤ 0.05) because we were specifically interested in identifying increases in poststimulus phase consistency.
To assess whether visually driven modulations interacted nonlinearly with the auditory response we applied the additive criterion model [AV vs (A + V); Stein and Meredith, 1993; Stein, 1998; Stanford et al., 2005; Avillac et al., 2007; Kayser et al., 2008; Mercier et al., 2013]; that is, we measured if the activity elicited in the multisensory condition differed from what would be expected from simple summation of the activity elicited by unisensory stimuli. Such nonlinear multisensory effects were then identified as either supra-additive or subadditive if the multisensory condition was, respectively, larger or smaller than the sum of unisensory conditions.
For this, a randomization method was used in which the average audiovisual response was compared with a representative distribution of the summed “unisensory” trials (see Senkowski et al., 2007; Mercier et al., 2013 for similar approaches). This distribution was built from a random subset of all possible summed combinations of the unisensory trials (baseline corrected), with the number of summed trials corresponding to the number of audiovisual trials. All randomization procedures (and therefore unisensory trials) were performed independently for each time point, and for the frequency analyses, for each frequency band. For tests of MSI effects in phase concentration, the unisensory trials were summed before being transformed through a wavelet convolution due to the nonlinearity of the procedure (Senkowski et al., 2006, 2007).
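The randomization against the additive model can be sketched for a single time point as follows. This is a simplified illustration, not the authors' script: it assumes baseline-corrected single-trial amplitudes, draws random auditory-plus-visual pairings with replacement rather than enumerating all combinations, and reports the proportion of surrogate (A + V) means at or above the observed AV mean. The function name, the default of 1000 iterations, and the fixed seed are assumptions.

```python
import random
from typing import Sequence


def additive_model_p(av: Sequence[float], a: Sequence[float],
                     v: Sequence[float], n_iter: int = 1000,
                     seed: int = 0) -> float:
    """Proportion of surrogate (A + V) means at or above the observed AV mean.

    Values near 0 suggest a supra-additive effect and values near 1 a
    subadditive effect (two-tailed reading: p <= 0.025 or p >= 0.975).
    """
    rng = random.Random(seed)
    n = len(av)
    observed = sum(av) / n
    at_or_above = 0
    for _ in range(n_iter):
        # As many summed unisensory trials as there are audiovisual trials
        surrogate = sum(rng.choice(a) + rng.choice(v) for _ in range(n)) / n
        if surrogate >= observed:
            at_or_above += 1
    return at_or_above / n_iter
```

For the phase measures, the summed unisensory trials would first be passed through the wavelet transform, as noted above, before the statistic of interest is computed on each surrogate set.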
Measures of functional connectivity.
To measure communication between COIs, we computed the phase-locking value (PLV; Tass et al., 1998; Lachaux et al., 1999), which is an index that represents the degree of phase synchrony between two signals. It measures, at a given time point, the variability across trials of the phase difference between two electrodes and is defined as follows: $\mathrm{PLV} = \frac{1}{n} \left| \sum_{k=1}^{n} e^{i(\phi_1 - \phi_2)} \right|$, with n being the number of trials and ϕ1 and ϕ2 being the phase measured on trial k at electrode 1 and 2, respectively.
PLV ranges between 1, when the phase difference is consistent across trials, and 0, when the phases at the two electrodes are randomly distributed with respect to each other. To test for significant increases in PLV, angle differences across trials were subjected to the Omnibus test (independently along the time and frequency dimensions). The Omnibus test is an alternative to the more commonly used Rayleigh test, with the advantage that it tests for circular uniformity without making assumptions about the underlying distribution (Berens, 2009). A second series of statistical tests was conducted using the random permutation procedure proposed by Lachaux et al. (1999). For the latter test, single trials were randomly permuted such that trials at one COI did not match the trials at the other COI when PLV was computed; that is, the within-trial phase relationship was broken. After each permutation the corresponding PLV was computed. A surrogate distribution was created by repeating this procedure 1000 times, independently for each time point and frequency. Finally, if the observed PLV was greater than 95% of the randomized values, it was considered significant, and the p value was calculated accordingly.
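The PLV and the trial-shuffling permutation test can be sketched together. The sketch assumes that the phases at each electrode are already available per trial (in radians) for one time point and one frequency; the function names and the fixed random seed are illustrative choices, and the p value is taken as the proportion of surrogate PLVs at or above the observed value.

```python
import cmath
import random
from typing import Sequence, Tuple


def plv(phi1: Sequence[float], phi2: Sequence[float]) -> float:
    """Phase-locking value: length of the mean phase-difference vector."""
    n = len(phi1)
    return abs(sum(cmath.exp(1j * (a - b)) for a, b in zip(phi1, phi2))) / n


def plv_permutation_p(phi1: Sequence[float], phi2: Sequence[float],
                      n_perm: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Return the observed PLV and its permutation p value."""
    rng = random.Random(seed)
    observed = plv(phi1, phi2)
    shuffled = list(phi2)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling trial order at one electrode breaks the within-trial
        # phase relationship while keeping each phase distribution intact
        rng.shuffle(shuffled)
        if plv(phi1, shuffled) >= observed:
            exceed += 1
    return observed, exceed / n_perm
```

Applied independently at each time point and frequency, this yields the maps of p values that are subsequently corrected for multiple comparisons.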
We next tested for nonlinear multisensory effects by comparing PLV for the multisensory condition against the sum of the unisensory conditions. To do so, we used a variant of the test described under multisensory statistics. First, all possible combinations of summed unisensory trials were computed for the two COIs (over auditory cortex and over motor cortex). Then they were transformed into time-frequency space to obtain the phase estimate. Next, a random subset of the single summed trials was selected and the corresponding PLV computed; the size of this subset was identical to the number of multisensory trials used to compute the observed PLV. This last procedure was repeated 1000 times to build an (A + V) distribution against which the observed multisensory PLV (two-tailed) could be compared statistically. PLV is a measure of phase synchrony, and as such it does not take into account the amplitude of the signal. To ensure that the effects observed were specific to phase, we further computed amplitude and power correlations. No systematic pattern emerged from these control analyses, either across participants or across conditions.
Measures of coupling between COIs (PLV indices and multisensory statistics using the additive model) were applied both on the data time locked to the stimulus and to the data time locked to the response. The first approach is more sensitive to phase alignment between auditory cortex and motor cortex relative to the stimulus onset (i.e., less susceptible to variability in sensory response), whereas the second approach is more sensitive to phase alignment relative to the motor response.
Measures of correlation between phase alignment indices and response time.
To investigate the functional implications of phase alignment, we assessed if there was any correlation between PCI and PLV and between PLV and RTs. Single trials were sorted per RTs (from the fastest to the slowest trials, using bins containing 10% of the trials for each iteration with an increment of 1) and running averages of PCI (recorded from auditory cortex), PLV (recorded between auditory and motor cortices), and RTs were computed. Each PCI and PLV, time locked to the stimulus, was computed at the latency corresponding to the highest value measured for each condition in the delta band. Finally, we compared running PCIs to PLVs and RTs to PLVs using the Pearson correlation. Last, to verify that correlation measures were not influenced by response time distribution, the same analyses were conducted after log transform and after reciprocal transform. With both approaches, comparable results were obtained, confirming results obtained with the nontransformed data.
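The sliding-bin procedure can be illustrated for the PLV-versus-RT comparison. The sketch assumes one RT and one pair of phases (auditory and motor COI, at the latency of interest) per trial; the window holds 10% of the trials and advances one trial at a time, as described above. The function names are hypothetical, and the Pearson coefficient is written out by hand only to keep the example free of dependencies.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_plv_rt(rts: Sequence[float], phi1: Sequence[float],
                   phi2: Sequence[float],
                   frac: float = 0.1) -> Tuple[List[float], List[float]]:
    """Running mean RT and running PLV over RT-sorted bins of trials."""
    order = sorted(range(len(rts)), key=rts.__getitem__)  # fastest first
    width = max(2, round(len(rts) * frac))
    mean_rts, plvs = [], []
    for start in range(len(order) - width + 1):  # increment of one trial
        idx = order[start:start + width]
        mean_rts.append(sum(rts[i] for i in idx) / width)
        plvs.append(abs(sum(cmath.exp(1j * (phi1[i] - phi2[i]))
                            for i in idx)) / width)
    return mean_rts, plvs
```

Calling `pearson(*sliding_plv_rt(rts, phi1, phi2))` then gives the correlation between running RT and running PLV. Because consecutive bins share most of their trials, the resulting points are not independent, which should be kept in mind when interpreting the associated p values.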
Control for multiple comparisons.
All p values were corrected (in the time dimension for ERP, and both the time and frequency dimensions for PCI, power, and PLV) using the false discovery rate procedure for dependent tests from Benjamini and Yekutieli (2001). This correction, a sequential Bonferroni-type procedure, is highly conservative and thus favors certainty (fewer type I errors) over statistical power (more type II errors; for consideration of different approaches to controlling for multiple comparisons, see Groppe et al., 2011). This approach is derived from Benjamini and Hochberg (1995) and is widely used to control for multiple comparisons in neuroimaging studies (Genovese et al., 2002).
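For reference, the Benjamini and Yekutieli (2001) step-up procedure can be sketched in a few lines. The function name and the default level q = 0.05 are assumptions for illustration; the procedure differs from Benjamini and Hochberg (1995) only by the harmonic-sum factor c(m) in the threshold.

```python
from typing import List, Sequence


def benjamini_yekutieli(pvals: Sequence[float], q: float = 0.05) -> List[bool]:
    """Flag which hypotheses are rejected under FDR control for dependent tests."""
    m = len(pvals)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # penalty for dependency
    order = sorted(range(m), key=pvals.__getitem__)
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Largest rank whose p value falls under its step-up threshold
        if pvals[i] <= rank * q / (m * c_m):
            cutoff_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected
```

Because c(m) grows with the number of tests, the thresholds become stricter as more time points and frequencies are tested, which is the source of the conservativeness noted above.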
All data analyses were performed in MATLAB (The MathWorks) using custom-written scripts, the Fieldtrip Toolbox (Oostenveld et al., 2011), and CircStat: A MATLAB Toolbox for Circular Statistics (Berens, 2009).
Results
There were four central questions driving the current work: first, we assessed if visual stimulation modulated local auditory cortical activity and second if this leads to local multisensory interactions; third, we investigated inter-regional interactions between auditory and motor cortices by measuring phase synchronization in both unisensory and multisensory contexts; and finally, using correlation measures, we tested for relationships between local and inter-regional phase alignment and between inter-regional phase alignment and response times. The experiment (a simple stimulus-detection task) was performed by three patients implanted with frontotemporal electrode grids, which allow for both high temporal and high spatial sampling of the electrophysiological signal. For each participant, the two most relevant electrodes were selected via conjunction of functional and anatomical characteristics to best capture neuronal activity from auditory and motor cortices.
Hit rates were close to ceiling, indicating that participants easily performed the task (hit rates: 97 ± 1%). RT data demonstrated the commonly observed multisensory RTE (Schröger and Widmann, 1998; Molholm et al., 2002, 2006; Teder-Salejärvi et al., 2002; Murray et al., 2005; Talsma and Woldorff, 2005; Senkowski et al., 2006; Moran et al., 2008; Gingras et al., 2009), with RTs to audiovisual stimuli (AV average across participants: 290 ± 25 ms) faster than RTs to either of the unisensory stimuli (A: 341 ± 34 ms and V: 349 ± 5 ms).
Primary and cross-sensory responses and MSI in auditory cortex
We first conducted analyses on the signal from the electrode that presented the largest and fastest auditory response. Most of the electrodes located along the lateral sulcus showed auditory-related signal modulation. For all participants, the strongest auditory-evoked potential (AEP) response, also occurring at the shortest latency, was positioned over the more posterior and superior region of the superior temporal gyrus (STG; Fig. 1A). Several intracranial studies have reported a close relationship between activity from primary and secondary auditory cortex, located in Heschl's gyrus, and the signal recorded over the posterior bank of the STG (Liegeois-Chauvel et al., 1991; Howard et al., 2000; Brugge et al., 2003; Yvert et al., 2005; Guéguin et al., 2007). For each participant, the selected electrode was therefore considered to be the one that best captured the earliest stages of auditory cortical processing (normalized MNI coordinates: x = 158, 157, 157; y = 113, 119, 116; z = 78, 82, 91, respectively, for Participants 1, 2, and 3). For both auditory-alone and audiovisual conditions, an AEP with onset before 100 ms was observed at the selected electrode (Fig. 1B). For the visual-alone condition, no clear sharp ERP was observed; instead a low-amplitude slow response emerged from baseline at later latencies and showed only brief periods of significance relative to baseline (Fig. 1B). The nature of this cross-sensory effect on multisensory processing was next tested using the additive model [AV vs (A + V)]. This comparison revealed subadditive effects in all participants, with the ERP to the multisensory condition smaller than would be expected from the sum of the unisensory responses.
To further characterize the mechanisms underlying these ERP responses, we conducted analyses in the time-frequency domain, with a particular interest in assessing whether increases in phase consistency were accompanied by increases in power; that is, we computed the PCI (see Materials and Methods) across trials to reveal whether ongoing oscillations were reset after stimulus presentation. We then examined concomitant power to investigate whether phase reset of ongoing oscillations was related to a classic ERP effect, or alternatively, whether it was modulatory in nature, occurring in the absence of significant increases in power. For auditory-alone and audiovisual conditions, significant increases in both PCI (Fig. 1C) and power were observed. Conversely, for the visual-alone condition, increases in PCI, which were only observed in the lowest delta and theta frequency bands (respectively, 3–4 and 5–8 Hz; Fig. 1B), did not co-occur with significant changes in power (Fig. 2). In other words, the presentation of a visual-alone stimulus led to a reorganization of ongoing oscillatory activity in auditory cortex, with the phase of low frequencies reset without any detectable change in power. This profile (i.e., increases in phase concentration without increases in power) is indicative of a modulatory effect through phase reset (Shah et al., 2004; Lakatos et al., 2007).
We then wanted to assess whether cross-sensory phase reset in auditory cortex (demonstrated in response to a visual-alone stimulus) modulated the response to a simultaneously presented auditory stimulus. To investigate this question, PCI measured for the audiovisual condition was compared with what would be expected from the sum of the unisensory conditions (auditory-alone plus visual-alone). Consistent across all participants, the application of the additive model [AV vs (A + V); Stein and Meredith, 1993; Stein, 1998; Stanford et al., 2005; Avillac et al., 2007; Kayser et al., 2008; Mercier et al., 2013] revealed a multisensory supra-additive effect in the lower frequency bands (theta and delta; Fig. 1D). This result advocates for multisensory interactions because the phases of low-frequency oscillations in the multisensory conditions were more strongly reset than what would be expected from the sum of the unisensory conditions. Further supra-additive multisensory effects, consistent across participants, were found in mid-range frequencies (alpha and beta bands). However, these multisensory effects were not linked to PCI modulations observed for the visual-alone condition in the same frequency range.
Response-related activity in motor cortex
Next we sought to investigate the relationship between activity in auditory and motor cortices. To do so, we first looked for the electrode most representative of motor-related activity. For each patient, analyses performed in the time domain and in the time-frequency domain localized the highest amplitude response-related activity to the same electrode, which was the case regardless of condition. In the time domain, the ERP time locked to the button press was characterized by a slow ramping, starting a couple of hundred milliseconds before the button press and peaking at approximately the time of motor response, then reversing in polarity. This typical Bereitschaftspotential profile (or readiness potential; Satow et al., 2003) was confirmed at the single-trial level, as shown in Figure 3B, where signal amplitude is depicted for individual trials sorted as a function of response time. For all conditions, the plots show fluctuations of the signal time locked to the button press, but not time locked to stimulus onset. Additionally, time-frequency analysis revealed the typical spectral characteristics of a motor response (Crone et al., 1998a, b; Aoki et al., 1999; Ball et al., 2008; Miller et al., 2009, 2012; Ruescher et al., 2013), consisting of a strong synchronization of low frequencies (<10 Hz) and high gamma band (>70 Hz) concurrent with a desynchronization in the beta band (Fig. 3B). For all participants, these temporal motifs of motor-related activity were observed in an electrode located along the most dorsal portion of the precentral sulcus (Fig. 1A; normalized MNI coordinates: x = 129, 135, 143; y = 98, 110, 110; z = 136, 133, 129, respectively, for Participants 1, 2, and 3). The present anatomical-functional observations are in full agreement with reports from other human intracranial studies (Crone et al., 1998a, b; Aoki et al., 1999; Ball et al., 2008; Miller et al., 2009, 2012; Ruescher et al., 2013).
Phase locking between auditory and motor cortices
After characterizing sensory and motor-related activity over auditory and motor cortices, we aimed to investigate their possible functional link. We estimated functional connectivity across trials using the PLV, which indexes phase consistency between two signals (Tass et al., 1998; Lachaux et al., 1999). For audiovisual and auditory-alone conditions, analysis of PLV, time locked to the stimulus, revealed a strong and significant increase of phase synchronization between auditory and motor cortices in the low theta and delta bands, occurring between presentation of the stimulus and the button-press response (Fig. 4A). For the visual-alone condition, there was a weaker but still significant PLV increase in two participants in the same theta and delta bands. These results demonstrate a synchronization of activity between auditory and motor cortices, which was stronger when an auditory input was presented. The same analysis conducted on the data, time locked to the response instead, showed an increase in PLV in the low-frequency band (Fig. 5A) that was weaker than that time locked to the stimulus, and only reached significance in the case of one participant. Thus coupling between auditory and motor cortices is more consistently locked to stimulus presentation than to the motor response.
Subsequently we investigated if there was a multisensory effect on synchrony, as defined by the additive model. We assessed if the increase in PLV from the multisensory condition was comparable to the sum of PLV from the unisensory conditions. The analysis of the data time locked to the stimulus revealed nonlinear multisensory interactions by showing a supra-additive effect for all participants (Fig. 4B); that is, in the multisensory condition, phase synchronization between auditory and motor cortices was stronger than expected from the sum of the unisensory conditions. Applying the same analysis to the data time locked to the motor response did not show systematically significant multisensory interactions (Fig. 5B). This second analysis lends further support to the notion that the supra-additive effects found in coupling when the data are analyzed time locked to the stimulus reflect differences in coupling time, with faster synchronization in the audiovisual condition.
Correlation between phase alignment indices and response time
First, to link phase synchronization between auditory and motor cortices with modulation of ongoing oscillations in auditory cortex, we correlated the two indices by computing sliding PCIs and PLVs using 10% of the trials binned by response times. The results in Figure 6A reveal a positive correlation between PCI and PLV along trial bins (P1: all rs > 0.33, all ps < 0.001; P2: AV and V: all rs > 0.67, p < 0.001; A: r = 0.15, p < 0.03; P3: AV and A: r > 0.23, p < 0.001; V: r = 0.04, p = 0.29); that is, the stronger the phase alignment was in auditory cortex, the higher the phase locking was between auditory and motor cortices. To extend our investigations, we also assessed the behavioral relevance of inter-regional phase synchronization. This analysis followed the same approach: trials were sorted and binned by response time, and PLVs, time locked to the stimulus, were computed for each bin. The results, depicted in Figure 6B, show a negative correlation between PLV and response time, which was further confirmed by running a Pearson correlation (Participant 1: all rs < −0.5, all ps < 0.001; Participant 2: AV and A: r < −0.5, p < 0.001, V: r < −0.1, p < 0.1; Participant 3: all rs < −0.3, all ps < 0.001). In summary, the greater the phase alignment in auditory cortex, the higher the synchrony between auditory and motor cortices and the faster the response times. These results thus link synchronization across a sensorimotor network with the speeding of response times.
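The binning approach can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure: trials are sorted by response time, a window containing 10% of the trials is advanced one trial at a time (the step size is an assumption), the PLV is computed within each window from single-trial phases at one time-frequency point, and the resulting values are correlated with the mean response time of each window. All names and the synthetic data are hypothetical.

```python
import numpy as np

def sliding_bin_plv_vs_rt(phase_a, phase_b, rts, bin_fraction=0.10):
    """Correlate PLV with response time across sliding trial bins.

    phase_a, phase_b: shape (n_trials,), single-trial phase (radians) at one
    time-frequency point of interest for two sites (hypothetical inputs).
    rts: shape (n_trials,), response times.
    Returns (plv_per_bin, mean_rt_per_bin, pearson_r).
    """
    rts = np.asarray(rts, dtype=float)
    order = np.argsort(rts)  # sort trials from fastest to slowest
    phase_diff = (np.asarray(phase_a) - np.asarray(phase_b))[order]
    sorted_rts = rts[order]

    bin_size = max(2, int(round(bin_fraction * rts.size)))
    n_bins = rts.size - bin_size + 1  # window advances by one trial
    plv = np.empty(n_bins)
    mean_rt = np.empty(n_bins)
    for start in range(n_bins):
        window = slice(start, start + bin_size)
        plv[start] = np.abs(np.mean(np.exp(1j * phase_diff[window])))
        mean_rt[start] = sorted_rts[window].mean()

    r = np.corrcoef(plv, mean_rt)[0, 1]  # Pearson correlation coefficient
    return plv, mean_rt, r

# Synthetic illustration: phase jitter between sites grows with response time.
rng = np.random.default_rng(2)
n_trials = 300
rts = rng.uniform(200.0, 500.0, size=n_trials)  # ms, hypothetical
jitter_sd = 1.5 * (rts - 200.0) / 300.0         # radians
phase_a = rng.uniform(-np.pi, np.pi, size=n_trials)
phase_b = phase_a - jitter_sd * rng.standard_normal(n_trials)

plv_bins, rt_bins, r = sliding_bin_plv_vs_rt(phase_a, phase_b, rts)
```

Because neighboring windows share most of their trials, the bins are not independent, which should be kept in mind when interpreting significance levels from such a correlation.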
Multisensory stimulation, even when fully redundant, facilitates behavior in simple reaction-time tasks (Schröger and Widmann, 1998; Molholm et al., 2002, 2006; Teder-Salejärvi et al., 2002; Murray et al., 2005; Talsma and Woldorff, 2005; Senkowski et al., 2006; Romei et al., 2007; Moran et al., 2008; Gingras et al., 2009). In the present study, we investigated the neural basis of this speeding. Studies demonstrate that phase reset of ongoing oscillations, via cross-sensory inputs (e.g., auditory inputs into visual cortex), is central to multisensory interactions in sensory cortex (Lakatos et al., 2009; Fiebelkorn et al., 2011; Romei et al., 2012; Mercier et al., 2013). Here, we asked whether such local phase alignment might also lead to inter-regional phase alignment across a sensorimotor network, and we hypothesized that such inter-regional synchronizations might give rise to a well established behavioral consequence of multisensory stimulation: the so-called redundant target effect. The present data reveal that multisensory stimulation indeed leads to greater local and faster inter-regional phase alignment, and that greater phase alignment across a sensorimotor network is linked to faster response times.
Visually driven modulation of ongoing activity in auditory cortex and multisensory interactions
Several noninvasive EEG and MEG studies have reported early latency multisensory interactions that localized best to auditory cortex (Foxe et al., 2000; Murray et al., 2005; Mishra et al., 2007; Raij et al., 2010; Thorne et al., 2011), consistent with findings from human neuroimaging (Foxe et al., 2002) and electrophysiological recordings in nonhuman primates (Schroeder and Foxe, 2002). Here, subdural recordings confirmed the presence of both early multisensory interactions (<200 ms) and visually driven activity in human auditory cortex. Notably, in stark contrast to the classical responses evoked by auditory-alone and audiovisual stimuli, the response in auditory cortex to visual-alone stimulation was dominated by low-amplitude, slow oscillatory activity (Fig. 1B). Further analysis showed that this visually driven modulation of ongoing activity over auditory cortex was attributable to the phase reset of oscillations in the delta and theta bands (Fig. 1C), consistent with earlier work in animals that showed that cross-sensory inputs can modulate neuronal firing by resetting the phase of ongoing oscillatory activity, without increasing signal power (Lakatos et al., 2007; Kayser et al., 2008). It also supports findings from recent noninvasive EEG work that described visually driven modulatory effects within low-frequency oscillations localized (via dipole source-modeling) to auditory cortex (Thorne et al., 2011). The present data go further yet to reveal the presence of nonlinear multisensory effects in the same low frequencies (i.e., delta and theta).
Further, supra-additive multisensory effects, consistent across participants, were also found in the alpha and beta bands, with PCI for the multisensory condition stronger than for the sum of unisensory conditions (Fig. 1D). Moreover, absence of significant PCI increases in the visual-alone condition in these bands raises questions as to the origins of these multisensory effects. We propose two possible explanations. First, it may be that PCI modulations in the visual-alone condition are simply too weak to survive the stringent correction for multiple comparisons applied here. Alternatively, this supra-additive effect may rely on a different mechanism than the one observed in the low frequencies (i.e., visually driven cross-modal phase reset), one that perhaps involves another anatomofunctional pathway. We reported a very similar observation in a previous study where auditory-driven modulation of activity in visual cortex was similarly linked to multisensory interactions (Mercier et al., 2013), even though these multisensory effects were not always associated with detectable cross-modal inputs in the auditory-alone condition. The source of these alpha and beta effects will bear further investigation in future studies.
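The distinction drawn above between phase reset and an evoked increase in power can be illustrated with a small simulation. The sketch treats PCI as the length of the mean unit phase vector across trials, which is an assumption about its exact computation, and all names and parameters are hypothetical. Ongoing oscillations of constant amplitude start each trial at a random phase and are reset to a common phase at a fixed sample.

```python
import numpy as np

def phase_concentration(analytic):
    """Across-trial phase consistency; analytic has shape (n_trials, n_times)."""
    unit_vectors = analytic / np.abs(analytic)
    return np.abs(unit_vectors.mean(axis=0))

def mean_power(analytic):
    """Across-trial mean of the squared amplitude at each time point."""
    return np.mean(np.abs(analytic) ** 2, axis=0)

rng = np.random.default_rng(1)
n_trials, n_times, reset_at = 200, 100, 50
cycles_per_sample = 0.05  # hypothetical oscillation frequency
time = np.arange(n_times)

# Before the reset: the same frequency on every trial, random starting phase.
start_phase = rng.uniform(-np.pi, np.pi, size=(n_trials, 1))
phase = start_phase + 2 * np.pi * cycles_per_sample * time
# From the reset onward: every trial follows the same phase trajectory.
phase[:, reset_at:] = 2 * np.pi * cycles_per_sample * (time[reset_at:] - reset_at)

analytic = 1.0 * np.exp(1j * phase)  # amplitude held constant throughout

pci = phase_concentration(analytic)  # low before the reset, 1 afterward
power = mean_power(analytic)         # unchanged by the reset
```

In this idealized case the phase alignment index jumps at the reset while single-trial power stays flat, which is the signature that separates a modulatory phase reset from a classical evoked response.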
Phase alignment between auditory and motor cortices and the influence on MSI
Recurrent interactions among multiple cortical and subcortical regions are thought to underlie even the simplest of cognitive tasks. An abiding question is just how distant neuronal populations involved in a given task communicate with each other. Coordination of fluctuations in neuronal oscillatory activity has been proposed to mediate such interactions by providing optimal temporal windows of communication between these populations (Engel et al., 2001; Varela et al., 2001; Fries, 2005). Synchronized activity between distant brain regions has now been demonstrated during a wide range of processes including perception (Sehatpour et al., 2008; Hipp et al., 2011), attention (Buschman and Miller, 2007; Womelsdorf et al., 2007; Gregoriou et al., 2009; Gray et al., 2015), memory (Palva et al., 2010; Liebe et al., 2012; Burke et al., 2013; Jutras et al., 2013), and sensorimotor coordination (Bressler et al., 1993). Bressler et al. (1993), for example, observed synchronous activity between striate and motor cortex while monkeys performed a go/no-go task, but only when the monkeys had to respond to the visual target (i.e., during go trials).
Here, we find that oscillatory phase alignment between auditory and motor cortices increased during the time period between stimulus presentation and the button-press response. This suggests active communication between two major nodes of the sensorimotor network, recruited to perform the task at hand. Moreover, we observed supra-additive multisensory effects on phase synchronization between auditory and motor cortices that were due to faster synchronization in the multisensory condition. Based on a significant correlation between phase reset in auditory cortex and subsequent phase alignment between auditory and motor cortices, we propose that stronger multisensory-driven local phase alignment leads to faster inter-regional phase alignment between auditory and motor cortices.
Oscillatory activity, MSI, and behavior
Finally, we examined the relationship between auditory and sensorimotor phase alignment and behavior. Correlation analysis revealed that greater phase alignment in the delta band (i.e., the frequency band that showed maximal sensorimotor phase alignment) was significantly related to faster responses. This suggests that speeding of responses commonly observed in multisensory redundant target tasks is due, at least partially, to faster multisensory-related increases in phase alignment between sensory and motor cortex.
Using the same experimental design and scalp-recorded EEG, our group previously identified a relationship between beta power over frontal, left central, and right occipital scalp regions and response times (Senkowski et al., 2006). Multisensory effects were found in the same frequency band and scalp regions, suggesting a link between multisensory interactions and multisensory response time facilitation. While in the present intracranial study we did not observe consistent effects in beta power, it must be noted that the highly localized LFPs were recorded over circumscribed brain regions. The present results, therefore, do not capture all task-related activity, which has been observed in neuroimaging studies to involve a large and distributed network (for reviews, see Senkowski et al., 2007 and Koelewijn et al., 2010; for an example of an fMRI study based on the same experimental design, see Martuzzi et al., 2007).
Brain oscillations are hierarchically organized, with the phase of low-frequency oscillations modulating the amplitude of higher frequencies (Lakatos et al., 2005; Maris et al., 2011; van der Meij et al., 2012). Further, phase-amplitude coupling has been reported in human intracranial recordings for spatially distributed electrodes (Maris et al., 2011; van der Meij et al., 2012). One could hypothesize that phase alignment at lower frequencies, as observed here, might be linked to multisensory effects in the beta band at other network nodes, such as the superior parietal lobule, which is known to be involved in MSI (Molholm et al., 2006; Moran et al., 2008).
Alternatively, analysis of PCI revealed supra-additive effects in the alpha and beta bands that were not associated with statistically significant cross-sensory phase reset in the visual-alone condition, unlike what was observed in the low frequencies. These two observations raise questions about the respective roles of the different frequency bands in neuronal interactions. Several studies that investigated inter-regional synchronization in different cognitive contexts revealed a parallel between functional hierarchy and distinct frequency bands (Buschman and Miller, 2007; Buschman et al., 2012; Bastos et al., 2015). For example, Bastos et al. (2015) investigated this question using dense subdural electrode coverage in nonhuman primates in conjunction with anatomical tracing. They demonstrated that distinct frequency bands support feedforward and feedback influences, with theta bands responsible for the former, whereas beta bands were implicated in the latter. Here, multisensory effects found in the delta-theta bands would therefore be expected to reflect feedforward processing, driving cross-sensory phase reset and phase synchrony between auditory and motor cortices, whereas multisensory effects in the alpha-beta bands would be linked to feedback from higher order multisensory processing zones.
Importantly, a recent MEG study conducted by Thorne et al. (2011) reported that cross-sensory phase reset of low frequencies in auditory cortex partly accounted for response time variability. The present results extend this finding by linking response times to phase alignment between auditory and motor cortices and by linking multisensory interactions in auditory cortex (i.e., cross-sensory phase reset) to inter-regional sensorimotor communication. Phase reset in sensory cortex appears not only to promote processing of incoming input by increasing the efficiency of the sensory system (as demonstrated by the inverse effectiveness effect; Lakatos et al., 2007), but also to enhance neuronal communication with other task-related brain regions.
As with all studies using ECoG in humans, response profiles over putatively similar cortical regions show a high degree of interindividual variability (Edwards et al., 2005, 2009; Molholm et al., 2006, 2014; Bidet-Caulet et al., 2007; Besle et al., 2008; Sehatpour et al., 2008; Sinai et al., 2009; Vidal et al., 2010; Butler et al., 2011; Gomez-Ramirez et al., 2011; Bahramisharif et al., 2013; Mercier et al., 2013), very much the same as is observed from noninvasive scalp recordings (Foxe and Simpson, 2002). Much of this variability is due to heterogeneity of underlying cortical geometry across individuals (Stensaas et al., 1974; Rademacher et al., 1993), such that electrodes will almost certainly be at varying orientations regarding generators of primary interest (Kelly et al., 2008). In turn, there is large interindividual variability in the timing of neural transmission and extent of the cortical network that is activated, even for very simple tasks.
It would also be preferable to engage a larger participant cohort. Sample sizes in ECoG studies are often limited by electrode coverage, which is solely dictated by medical needs. Such surgeries are relatively rare, patients often present with significant cognitive compromise that precludes participation, and others will be excluded because of contamination of recordings by epileptic activity. Nonetheless, clear commonalities can be observed and the fact that similar phenomena can be established as statistically robust at the individual participant level across three unique individuals provides a large degree of confidence that we are observing the same underlying mechanisms.
As mentioned, a limitation of ECoG studies concerns spatial coverage, since electrode implants cannot reasonably cover all brain areas involved in a given task. Therefore, exhaustively mapping a given functional network requires large cohorts. A recent study estimated that 50–100 patients (with depth electrodes) would be required to reach 90% coverage depending on the atlas parcellation used (Arnulfo et al., 2015). Also, it is important to clarify that we do not assume that the redundant target effect can be exclusively ascribed to phase-synchronization mechanisms in sensory cortices. Neuroimaging has implicated an extensive network of regions involved in audiovisual integrative processing, even for the very basic stimuli and simple task used herein (Martuzzi et al., 2007). Our aim here is simply to demonstrate that brain oscillatory activity is one mechanism participating in production of the RTE.
Summary and conclusions
Recordings over auditory and motor cortices during a reaction time task revealed (1) visually driven cross-sensory delta band phase reset in auditory cortex; (2) supra-additive multisensory interaction effects [AV > (A + V)] on phase alignment in the delta and theta bands; (3) phase synchronization between auditory and motor cortex in the delta band for all stimulation conditions, with faster sensorimotor inter-regional phase alignment in the multisensory condition; and (4) faster responses with stronger inter-regional phase synchronization. These data thus suggest that multisensory stimulation leads to stronger (i.e., more consistent) local phase alignment in sensory cortex, which in turn leads to faster inter-regional phase synchrony between sensory and motor cortex. This increase in local phase alignment allows for more rapid transfer of information across a sensorimotor network and, consequently, faster responses.
This work was primarily supported by a grant from the U.S. National Science Foundation to J. J. F. (BCS1228595). Part of the data analysis was performed using the Fieldtrip toolbox for EEG/MEG-analysis, developed at the Donders Institute for Brain, Cognition and Behavior (Oostenveld et al., 2011). We thank the three patients who donated their time and energy with enthusiasm at a challenging time for them.
The authors declare no competing financial interests.
Correspondence should be addressed to either Manuel R. Mercier or John J. Foxe, The Sheryl and Daniel R. Tishman Cognitive Neurophysiology Laboratory, Children's Evaluation and Rehabilitation Center, Department of Pediatrics, Albert Einstein College of Medicine & Montefiore Medical Center, Van Etten Building, Wing 1C, 1225 Morris Park Avenue, Bronx, NY 10461.