Crossmodal sensory interactions serve to integrate behaviorally relevant sensory stimuli. In this study, we investigated the effect of modulating crossmodal interactions between visual and somatosensory stimuli that in isolation do not reach perceptual awareness. When a subthreshold somatosensory stimulus was delivered within close spatiotemporal congruency to the expected site of perception of a phosphene, a subthreshold transcranial magnetic stimulation pulse delivered to the occipital cortex evoked a visual percept. The results suggest that under subthreshold conditions of visual and somatosensory stimulation, crossmodal interactions presented in a spatially and temporally specific manner can sum up to become behaviorally significant. These interactions may reflect an underlying anatomical connectivity and become further enhanced by attention modulation mechanisms.
In everyday experience, the brain must integrate information from multiple sensory modalities to create a unified sensory percept (Stein et al., 1993; Ghazanfar and Schroeder, 2006). This ability confers clear behavioral advantages, such as enhanced object identification and saliency, increased detection rate, and reduced ambiguity of a perceived sensory event (Calvert et al., 2000; Frassinetti et al., 2002; Bolognini et al., 2005). The perception of objects and the detection of environmental events are improved by focusing attention on the location of stimuli (Wright, 1998; McDonald et al., 2003). However, the mechanisms associated with the detection of crossmodal stimuli presented under subthreshold conditions are not known and have rarely been investigated.
Numerous studies have demonstrated strong crossmodal interactions within the visual–tactile (Foxe et al., 2002) and visual–auditory domains (Molholm et al., 2004). Work by Macaluso et al. (2000) demonstrated that tactile and visual stimuli presented in close spatial and temporal proximity influence behavioral performance outcomes and also modulate activity [as measured with functional magnetic resonance imaging (fMRI)] within brain areas typically assumed to be strictly unimodal.
It has been traditionally assumed that crossmodal integration occurs within “heteromodal” higher-order areas after the processing of sensory signals within early, unimodal stages of the brain (Felleman and Van Essen, 1991) [but see also Schroeder and Foxe (2005) for discussion]. However, anatomical evidence confirms the existence of direct connections between primary sensory areas within the nonhuman primate brain (Falchier et al., 2002; Rockland and Ojima, 2003; Cappe and Barone, 2005), suggesting that this view may be an oversimplification. More recent evidence has shown that activity within primary sensory areas (such as primary visual cortex) can be modulated by crossmodal integration during the perception of visuo-auditory illusions (Watkins et al., 2006) and tasks associated with attention modulation (Jack et al., 2006). Along the same lines, contrasting tasks that manipulate overt and covert attention (Macaluso et al., 2005) can help uncover multisensory interactions and reveal underlying “hard-wired” mechanisms whose activity can be modulated by attention under certain behavioral conditions.
In this study, we investigated crossmodal interactive effects by presenting subthreshold visual stimulation [using transcranial magnetic stimulation (TMS)] combined with subthreshold somatosensory afferences [using peripheral electrical stimulation (PES)]. Multiple interstimulus intervals (ISIs) were tested. By stimulating the occipital cortex directly with TMS to induce phosphenes (rather than presenting an external visual stimulus), we could assess changes in visual cortex physiology as a function of its level of excitability. In addition, the role of direct neural pathways was tested by randomly presenting a sensory stimulus to one hand or the other. Finally, the role of spatial congruency in these crossmodal sensory interactions was assessed by testing the hands in crossed and uncrossed postures.
Given that spatial–temporal congruency is critical for crossmodal interactions and that attentional shifts can modulate these responses, a “hard-wired” sensory interactive mechanism would predict that crossmodal interactions might be revealed even when using subthreshold sensory stimuli. Specifically, we used subthreshold stimuli to investigate whether appropriately delivered crossmodal afferences might summate to generate a reportable percept.
Materials and Methods
A total of six right-handed, healthy subjects (three males; mean age, 21.3 years) participated in the study. All were previously naive to phosphene stimulation induced by TMS as well as to the purpose of the study. Only subjects who reported robust, consistent and stable phosphenes were enrolled (determined in an initial screening and training session before the actual experiment). Based on these criteria, three subjects were excluded because they could neither perceive nor report phosphenes reliably. Three other subjects failed to complete all of the testing sessions of the experiment and were also excluded from the final analysis. Written informed consent was obtained from each participant, and the study was approved by the institutional review board of Beth Israel Deaconess Medical Center.
Participants wore a specially designed blindfold for a period of ∼2.5 h to ensure that they remained in total darkness throughout the experiment. Single-pulse TMS was delivered over the left occipital cortex using a Magstim Super Rapid stimulator (Magstim Company, Whitland, UK) connected to a standard figure-eight-shaped coil (70 mm diameter). First, the site of occipital cortex stimulation was determined and defined as the site over which TMS triggered a visual phosphene at the lowest possible stimulation intensity. The TMS target was then kept constant throughout the experiment with the aid of a frameless stereotaxic tracking device (Brainsight Frameless version 1.5; Rogue-Research, Montreal, Quebec, Canada) (Gugino et al., 2001). In general, the targeted brain region was 2 cm dorsal and 0.6 cm lateral to the inion (left hemisphere), overlying the calcarine sulcus and corresponding to primary visual cortex (V1). After identification of the optimal target position, the phosphene threshold (PT) was determined using the method of limits and defined as the lowest stimulus strength evoking the perception of a phosphene in three of five consecutive trials (Kammer et al., 2001; Stewart et al., 2001; Fernandez et al., 2002). PT was measured at the beginning and the end of each experimental block.
PES was applied to either the right or left index finger (pulse duration, 200 μs) using a Digitimer (Hertfordshire, UK) DS-7A electrical stimulator. As with PT, somatosensory threshold was determined using a method of limits and defined as the lowest stimulation intensity giving rise to a detectable stimulus in three of five consecutive trials.
Throughout the experiment, visual and somatosensory stimuli were delivered at subthreshold intensity. Specifically, left occipital TMS was delivered to the optimal phosphene-inducing locus but at an intensity of 80% of PT (mean subject PT, 41.2 ± 11.7% SD of maximal stimulator output), and PES was applied at 80% of the subject's somatosensory threshold (mean PES right hand, 2 ± 0.2 mA SD; left hand, 2.2 ± 0.2 mA SD). Given that PES was randomly delivered across trials at subthreshold intensity and that the electrodes were attached to both hands at all times, subjects could neither anticipate nor identify which hand, if any, was being stimulated during the experiment.
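The threshold criterion and the 80% subthreshold setting described above can be illustrated with a minimal Python sketch. It assumes that "three of five consecutive trials" means at least three detections among five trials at a given intensity. The function names and the response data are hypothetical and are not taken from the study.

```python
def meets_criterion(responses, needed=3, window=5):
    """Return True if at least `needed` of `window` trials were detected.

    `responses` is a list of booleans (True = stimulus reported).
    Treating the criterion as "at least three" is an assumption.
    """
    if len(responses) != window:
        raise ValueError(f"expected {window} trials, got {len(responses)}")
    return sum(responses) >= needed


def find_threshold(trials_by_intensity):
    """Return the lowest intensity whose trials meet the criterion, or None."""
    passing = [level for level, responses in trials_by_intensity.items()
               if meets_criterion(responses)]
    return min(passing) if passing else None


def subthreshold_intensity(threshold, fraction=0.8):
    """Stimulation intensity as a fraction of threshold (80% in the text)."""
    return fraction * threshold


# Hypothetical responses, keyed by intensity (% of maximal stimulator output).
example = {
    38: [False, False, True, False, False],
    40: [True, False, True, False, False],
    42: [True, True, False, True, False],
    44: [True, True, True, True, False],
}

pt = find_threshold(example)            # lowest passing intensity in `example`
tms_level = subthreshold_intensity(pt)  # 80% of that threshold
```

Applied to the reported mean PT of 41.2% of maximal stimulator output, the same rule gives 0.8 × 41.2 ≈ 33% of maximal output, although each subject's intensity was presumably set from his or her own threshold.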
Subthreshold occipital TMS was delivered alone (i.e., in the absence of PES; control condition) or paired with PES delivered to the right or left hand in the uncrossed position (experimental conditions 1 and 2) (Fig. 1, Uncrossed), or the crossed position (experimental conditions 3 and 4) (Fig. 1, Crossed). The timing of both TMS and PES was controlled by computer (PsyScope version 1.2.5). Subjects underwent experimental blocks lasting ∼13 min in which PES was always applied before TMS delivery and at randomized ISIs (40, 60, 80, or 100 ms) for a total of 80 trials. Trials were separated by at least 10 s to minimize any carry-over effects of TMS on cortical excitability. During each block, 25% of the trials consisted of TMS alone (i.e., without PES). Thus, 40 stimuli were elicited at each possible ISI and for each PES condition for a total of 240 stimuli. The participants underwent four blocks during each session, and a 15 min break was allotted between blocks.
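The block structure described above can be sketched in Python as follows. This is only an illustration, not the PsyScope script used in the study: the function name, the dictionary layout, and the choice to draw hand and ISI independently at random for each paired trial are assumptions.

```python
import random

ISIS_MS = (40, 60, 80, 100)   # PES-to-TMS intervals given in the text
TRIALS_PER_BLOCK = 80         # trials per block given in the text
TMS_ALONE_FRACTION = 0.25     # proportion of TMS-alone (control) trials
MIN_INTERTRIAL_S = 10         # minimum separation between trials


def make_block(seed=None):
    """Build one randomized block of trials (hypothetical helper).

    Each trial is a dict; `pes_hand` and `isi_ms` are None on TMS-alone trials.
    Hand and ISI are sampled independently per trial, which is an assumption:
    the text does not say whether the design was balanced within a block.
    """
    rng = random.Random(seed)
    n_alone = int(TRIALS_PER_BLOCK * TMS_ALONE_FRACTION)
    trials = [{"pes_hand": None, "isi_ms": None} for _ in range(n_alone)]
    for _ in range(TRIALS_PER_BLOCK - n_alone):
        trials.append({
            "pes_hand": rng.choice(("left", "right")),
            "isi_ms": rng.choice(ISIS_MS),  # PES always precedes TMS
        })
    rng.shuffle(trials)
    return trials


block = make_block(seed=0)
n_tms_alone = sum(trial["pes_hand"] is None for trial in block)

# Lower bound on block duration implied by the minimum inter-trial interval.
min_block_minutes = TRIALS_PER_BLOCK * MIN_INTERTRIAL_S / 60
```

With 80 trials separated by at least 10 s, a block lasts at least 800 s (about 13.3 min), which is consistent with the reported block length of ∼13 min.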
After each TMS pulse, subjects were required to report verbally whether or not they perceived a phosphene and, if so, to describe its location. Subjects were also instructed to report whether they perceived any tactile sensations as an additional means to control for subthreshold somatosensory (PES) stimulation. Statistical analysis was performed using Stata (College Station, TX) statistical software (version 8.0). As subject responses corresponded to a binary outcome, we used a logistic regression model in which the dependent variable was PHOSPHENE DETECTION (yes or no) and the independent variables were EXPERIMENTAL CONDITION (four PES conditions) and ISI (40, 60, 80, and 100 ms). We report post hoc comparisons reaching significance after Bonferroni correction. The control condition “No PES” (TMS alone) was not tested at different ISIs and thus was not included in the main regression model. For this reason, we fitted a modified logistic regression model that included only the covariate CONDITION (the control condition and the four experimental conditions).
During PT determination, all subjects reported phosphene percepts within the contralateral (i.e., right) visual field. At no time during the experiment did the subjects report feeling any tactile stimulation to their hands. Figure 2 shows the rate of phosphene detection for each PES condition and ISI tested. It is evident that trials pairing TMS with PES to the right hand led to a striking increase in the rate of reporting the perception of phosphenes. The largest crossmodal enhancement effect was apparent after subthreshold stimulation of the right index finger at an ISI of 60 ms and with the hands in the uncrossed position.
Using a multivariate analysis, the logistic regression model detected a significant interaction term between [experimental condition] and [ISI] (z = −2.65; p = 0.008). We then analyzed individual 2 × 4 tables [phosphene (yes/no) and ISIs (40, 60, 80, and 100 ms)] for each experimental condition. This analysis revealed a significant effect of ISI for the conditions right uncrossed (p < 0.001) and right crossed (p < 0.001) and no difference in phosphene detection across the different ISIs for the conditions left uncrossed (p = 0.86) and left crossed (p = 0.33).
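A 2 × 4 table of this kind can be tested in several ways, and the text does not name the test that was applied. As one possibility, the Python sketch below computes a Pearson chi-square statistic for a 2 × k table together with the closed-form upper-tail probability for 3 degrees of freedom. The function names are hypothetical, and the counts are invented for illustration only; they are not data from the study.

```python
import math


def chi2_2xk(yes_counts, no_counts):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table.

    `yes_counts[j]` and `no_counts[j]` are the numbers of trials with and
    without a reported phosphene in column j (for example, one column per ISI).
    """
    total_yes, total_no = sum(yes_counts), sum(no_counts)
    n = total_yes + total_no
    statistic = 0.0
    for yes, no in zip(yes_counts, no_counts):
        column_total = yes + no
        for observed, row_total in ((yes, total_yes), (no, total_no)):
            expected = row_total * column_total / n
            if expected == 0:
                raise ValueError("empty row or column; test is undefined")
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(yes_counts) - 1


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Invented counts for ISIs of 40, 60, 80, and 100 ms (illustration only).
yes = [12, 25, 15, 10]
no = [28, 15, 25, 30]
statistic, df = chi2_2xk(yes, no)
p_value = chi2_sf_df3(statistic)  # valid here because df == 3
```

A chi-square approximation is only one choice; with small expected counts, an exact test would be a reasonable alternative.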
We then tested whether phosphene detection differed across experimental conditions at each ISI using a similar analysis [2 × 4 table, phosphenes (yes/no) and experimental condition (left crossed, left uncrossed, right crossed, and right uncrossed)]. This analysis showed that there was a significant difference in the rate of phosphene detection for each ISI across the different experimental conditions (p < 0.001 for all ISIs tested), indicating that phosphene detection rate was always higher for the conditions in which the right hand was stimulated compared with the left.
We then analyzed whether there was a difference in phosphene detection between right uncrossed and right crossed conditions. This analysis revealed a trend toward a significant difference for the ISIs of 40 ms (p = 0.074) and 60 ms (p = 0.083) but not for the ISIs of 80 ms (p = 0.21) and 100 ms (p = 1). For the ISIs of 40 and 60 ms, the phosphene detection rate was highest when the hands were in the uncrossed position.
Finally, no significant difference in phosphene detection rate was found when the crossed and uncrossed conditions for the left hand were compared at any of the ISIs tested (p > 0.2 for ISIs of 40, 60, 80, and 100 ms). Comparing the left hand conditions with no PES revealed no difference across these conditions (z = −1; p = 0.32). A similar analysis of the right hand showed a highly significant difference across right hand conditions and no PES (z = −14.74; p < 0.001).
The results of this study suggest that a subthreshold somatosensory stimulus that is spatially and temporally specific leads to a crossmodal visual enhancement effect. Specifically, subthreshold somatosensory stimuli presented in a spatiotemporally congruent manner modified visual cortex excitability in such a way that TMS delivered to visual cortex evoked the perception of a visual phosphene at a stimulation intensity that was previously subthreshold for phosphene induction.
These results are in general agreement with previously reported crossmodal spatial congruency effects. For example, Macaluso et al. (2000) used event-related fMRI to demonstrate that tactile stimulation enhanced activity within unimodal visual cortical areas, but only when it was on the same side as a visual target. In our study, the crossmodal enhancement effect was maximal when the peripheral somatosensory stimulus preceded the occipital TMS by 60 ms. Eimer and coworkers used event-related potentials (ERPs) to investigate crossmodal links in spatial attention between vision, audition, and touch under conditions in which attention was directed to a specific location within one (primary) modality, whereas stimuli in another (secondary) modality were to be ignored regardless of their position (Eimer, 2001). In that study, ERP effects of spatial attention were observed not only in the primary modality, but also for secondary modality stimuli, thus revealing crossmodal links in spatial attention. This effect occurred at relatively early sensory-specific ERP components between 100 and 200 ms after stimulus. Beyond 200 ms, ERPs to secondary modality stimuli were scarcely affected by the current focus of attention within another modality (Eimer, 2001).
The crossmodal effect in our study may reflect a pathway mediating the integration of crossmodal sensory signals. A feedback network involving primary and secondary somatosensory areas, parietal multimodal and ultimately secondary and primary visual areas has been proposed as an anatomical substrate mediating visuo-tactile crossmodal interactive effects (Macaluso et al., 2005). In fact, recent anatomical evidence from the macaque brain suggests that such heteromodal connectivity between somatosensory and visual areas does indeed exist (Cappe and Barone, 2005). However, in our study, optimal crossmodal interactions were evident at an interstimulus delay of 60 ms. This relatively rapid modulatory response time would not be consistent with a top-down mechanism acting through parietal cortical areas. A review of the response latencies across primate visual areas (both subcortical and cortical) supports this argument (Bullier and Nowak, 1995). Given that tactile information typically arrives at primary sensory cortex within 20 ms, only the remaining 40 ms would be available for modulating activity within the visual cortex. Response latencies in parietal areas have typically been reported as much longer; thus, this relatively rapid time period would be more consistent with a bottom-up mechanism (for example, from subcortical areas such as the putamen and superior colliculus) or even possibly a direct interaction between early somatosensory and visual cortical areas.
By comparing the modulatory effects of placing the hands in the crossed and uncrossed position, we investigated the contribution of spatial attention in crossmodal interactions. Crossing the hands over the midline of the body has been shown to lead to a general decrease in performance during crossmodal visuo-tactile tasks and to depend on the specific task requirements (Macaluso et al., 2005). In our study, the crossmodal enhancement effect was still present when the hands were in the crossed position. Crossmodal enhancement effects have been shown to arise even if the tactile cues are task irrelevant and do not predict the location of the visual targets (i.e., follow the hand position), suggesting an exogenous (stimulus-driven) attentional mechanism (Holmes et al., 2006). In our experiment, subthreshold stimuli reveal a crossmodal effect that follows the hand. Thus, when subjects are unaware of the presence of crossmodal interactive stimuli, it appears that the brain defaults to sensory interactions based on hard-wired connections (perhaps with a greater proprioceptive component) rather than on attention-based spatial registration [as has been reported previously (Macaluso et al., 2000)].
The intensity and duration of the stimuli we applied are considerably lower than those reported in the crossmodal modulation literature. Stimuli that are normally undetectable modulate cortical activity (Libet et al., 1967; Blankenburg et al., 2003), indicating that subthreshold afferences are indeed processed to a certain extent. Blankenburg et al. (2003) used fMRI to characterize cortical processing during imperceptible electrical finger stimulation and reported a BOLD (blood oxygenation level-dependent) signal decrease (focal “deactivation”) localized to the hand area of primary somatosensory cortex. They interpreted the net cortical deactivation as a reduced level of baseline activity suppressing noninformative sensory “noise.” Neurophysiological evidence in cat superior colliculus has shown that stronger crossmodal interactions can occur when feeble unimodal sensory stimuli are presented (Meredith and Stein, 1986). This “inverse effectiveness” principle may account for the increased phosphene perception we describe. Moreover, our results may represent a release of these modality-specific areas from a baseline inhibitory effect, thus allowing sensory interactions to approach awareness under specific conditions of spatiotemporal congruency. Presumably, once the combination of sensory stimuli is deemed to be behaviorally relevant, attentional mechanisms [possibly through parietal feedback modulatory connections (Macaluso et al., 2000)] may then enhance these crossmodal sensory interactions.
Previous studies have addressed the effect of short-term visual deprivation on visual cortex excitability, which was measured through PTs. Specifically, after 45 min of complete visual deprivation (blindfolded), normally sighted subjects show a significant decrease in PT (that is, an increase in cortical excitability), as well as complete recovery to baseline levels after re-exposure to light (Boroojerdi et al., 2000; Fierro et al., 2005). Therefore, it is possible that the ability of our subjects to perceive phosphenes resulted from an enhancement in overall visual cortex excitability over time as a consequence of visual deprivation. Several lines of evidence argue against this possibility. First, subjects' PTs were measured in a previous training session and again during the experiment immediately before each experimental block. Second, only subjects with a stable PT were enrolled in the study. Third, an analysis of PTs obtained before each experimental block revealed no statistical difference when comparing across blocks within each condition. Furthermore, the crossmodal effect is spatially and temporally specific despite the conditions being presented in a randomized manner. Hence, if an isolated overall increase in visual cortex excitability accounted for the observed effect, an enhancement of crossmodal interactions across all conditions, rather than only certain specific ones, would have occurred.
In summary, these results suggest the existence of specific pathways linking specialized areas across sensory modalities. Furthermore, these crossmodal sensory interactions can be revealed under subthreshold conditions and follow principles of spatial and temporal congruency.
This work was supported by National Institutes of Health Grants K24-RR018875 and RO1-EY12091 (A.P.-L.) and K23-EY016131 (L.M.).
Correspondence should be addressed to Dr. Alvaro Pascual-Leone, Center for Noninvasive Brain Stimulation, Beth Israel Deaconess Medical Center–Harvard Medical School, 330 Brookline Avenue, Boston, MA 02215.