Multisensory neurons in the superior colliculus (SC) have the capability to integrate signals that belong to the same event, despite being conveyed by different senses. They develop this capability during early life as experience is gained with the statistics of cross-modal events. These adaptations prepare the SC to deal with the cross-modal events that are likely to be encountered throughout life. Here, we found that neurons in the adult SC can also adapt to experience with sequentially ordered cross-modal (visual-auditory or auditory-visual) cues, and that they do so over short periods of time (minutes), as if adapting to a particular stimulus configuration. This short-term plasticity was evident as a rapid increase in the magnitude and duration of responses to the first stimulus, and a shortening of the latency and increase in magnitude of the responses to the second stimulus when they are presented in sequence. The result was that the two responses appeared to merge. These changes were stable in the absence of experience with competing stimulus configurations, outlasted the exposure period, and could not be induced by equivalent experience with sequential within-modal (visual-visual or auditory-auditory) stimuli. A parsimonious interpretation is that the additional SC activity provided by the second stimulus became associated with, and increased the potency of, the afferents responding to the preceding stimulus. This interpretation is consistent with the principle of spike-timing-dependent plasticity, which may provide the basic mechanism for short term or long term plasticity and be operative in both the adult and neonatal SC.
Multisensory superior colliculus (SC) neurons craft their ability to integrate multiple sensory inputs based on experience. At birth, the few sensory-responsive neurons in the cat SC respond exclusively to tactile stimulation (Stein et al., 1973). Multisensory neurons appear several days later and, like those in monkey, have very large receptive fields and are unable to engage in multisensory integration. Responses to cross-modal stimuli are no greater or weaker than those to the most effective of them alone (Wallace and Stein, 2000, 2001). These observations, and those from studies of human infants (Neil et al., 2006; Putzar et al., 2007; Gori et al., 2008), emphasize the gradual postnatal elaboration of multisensory integration. Presumably, experience with the statistics of the natural world, in which cross-modal cues derived from common events occur in close correspondence (Stein and Stanford, 2008), leads a neuron's different receptive fields to shrink into register and establishes the spatial principle of multisensory integration: coincident cross-modal stimuli elicit more robust responses than the most effective component stimulus, and disparate cross-modal stimuli either elicit no integration or response depression (Meredith and Stein, 1986). Similarly, animals reared with spatially disparate visual and auditory stimuli develop spatially disparate receptive fields and enhanced responses to disparate stimuli (Wallace and Stein, 2007).
During this formative period the brain gradually adapts its information processing capabilities to deal with the presumptive environment in which it will function (Rauschecker, 1991). Spike-timing-dependent plasticity (STDP) provides a framework for understanding these adaptations by codifying Hebb's (1949) original learning rule in which connections from one neuronal ensemble to another are strengthened if activity in the originator produces activity in the target (Gerstner et al., 1996; Senn et al., 2001). This learning rule is used in many areas of the brain, especially unisensory cortex (De Felipe et al., 1997; Schuett et al., 2001; Fu et al., 2002; Fu et al., 2004; Jacob et al., 2007; Young et al., 2007; Dahmen et al., 2008; see also Dan and Poo, 2004). Its principles are consistent with the developmental observations described above in that different SC sensory afferents are coactivated by common events and would be predicted to mutually reinforce each other. As a consequence, STDP might be engaged in defining the overlapping receptive fields and integrative properties of SC neurons during development and building the overarching structure to guide multisensory integration (Friedel and van Hemmen, 2008).
However, once this structure is in place there still remains a need for rapid adaptations to accommodate immediate, possibly unusual, circumstances indicated by changes in the statistics of cross-modal events, especially at maturity when animals are independent. The data obtained here reveal that adult SC neurons can adapt their fundamental properties to cross-modal experience over short time scales (minutes) in ways predicted by the STDP learning algorithm. This suggests that adult multisensory SC neurons adapt to the statistics of short-term cross-modal sensory experiences, much as their neonatal counterparts adapt to the statistics of long-term sensory experiences.
Materials and Methods
All protocols used were in accordance with the Guide for the Care and Use of Laboratory Animals (National Institutes of Health Publication 86-23) and were approved by the Animal Care and Use Committee of Wake Forest University School of Medicine, an AAALAC-accredited institution.
Three adult cats were implanted with stainless-steel recording chambers using previously described techniques (McHaffie and Stein, 1983). Briefly, each animal was first anesthetized with a mixture of ketamine hydrochloride (20 mg/kg, i.m.) and acepromazine maleate (0.4 mg/kg, i.m.). The animal was then intubated, placed in a stereotaxic head holder, and anesthetized for surgery with isoflurane (1.5–3%). A recording cylinder was placed over a craniotomy to provide access to the SC via the overlying cortex and attached to the skull with stainless steel screws and dental acrylic. The animal was then allowed to recover from surgery. Postsurgical analgesics (butorphanol tartrate, 0.1–0.2 mg/kg and ketoprofen, 1 mg/kg) were administered as needed and antibiotics (cephazolin sodium, 25 mg/kg) were administered three times daily for 7 d.
Single neuron recording.
The chronic recording well/head holder provided a means of holding the animal during recording sessions without any wounds or pressure points. Weekly recording sessions began after a postsurgical recovery period of at least 7 d. For each session the animal was anesthetized with a combination of ketamine hydrochloride (20 mg/kg, i.m.) and acepromazine maleate (0.4 mg/kg, i.m.), intubated, and artificially ventilated. Respiratory rate and volume, heart rate, and blood pressure were monitored and end-tidal CO2 was maintained at ∼4.0%. Paralysis was induced with an injection of pancuronium bromide (0.06 mg/kg, i.v.) to fix the positions of the eye and ear. During each recording session, anesthesia, paralysis, and hydration were maintained by continuous intravenous infusion of ketamine hydrochloride (4–6 mg kg−1 h−1), pancuronium bromide (0.1–0.2 mg kg−1 h−1), and 5% dextrose in 0.9% normal saline (3–6 ml/h). Body temperature was kept at 37−38°C using a heating pad. The recording chamber/head-holder was attached to a metal frame that did not obscure the animal's sight or hearing and allowed unobstructed access to the SC. The pupil of the eye contralateral to the SC under study was dilated with 1% atropine sulfate, and a contact lens was placed on the cornea to correct its refractive error. An opaque lens occluded the other eye. At the end of the recording session, the anesthesia and paralysis were terminated. The animal was returned to its home cage once stable respiration and coordinated locomotion were reinstated.
Conventional methods were used for single neuron electrophysiological recording. The electrode (tip diameter: 1–3 μm, impedance: 1–3 MΩ at 1 kHz) was positioned on a microdrive stage and lowered to the SC. Once at the SC surface, the electrode was advanced with the hydraulic microdrive while visual and auditory search stimuli were presented. Single-unit neural activity was recorded, amplified, and routed to an oscilloscope, audio monitor, and computer for online and offline analyses as in the past (Alvarado et al., 2008).
Apparatus and search paradigm.
Sensory-responsive neurons were identified using a variety of visual and auditory stimuli. Visual search stimuli consisted of moving bars of light projected (from an LC 4445 Philips projector) onto a tangent screen located 45 cm from the front of the animal's head. Auditory stimuli consisted of broadband (20–20,000 Hz) noise bursts, clicks, and taps. Stimuli were controlled using custom software operating a NIDAQ digital controller (National Instruments) connected to a personal computer. When a multisensory (e.g., visual-auditory) neuron was isolated, its modality-specific visual and auditory receptive fields were mapped as in the past (Alvarado et al., 2008). The visual receptive field was mapped with moving light bars and the auditory receptive field was mapped with broadband noise bursts from any of 16 speakers placed 15° apart and 15 cm from the head on a rotating hoop that permitted adjustments in elevation.
Auditory stimuli consisted of brief (50 ms) 60–65 dB sound pressure level (SPL) bursts of broadband noise presented against ambient background SPL 51.2–52 dB within the “best area” of the receptive field. Visual stimuli were 100–300 ms duration, ∼4° × 2° or 8° × 2° moving bars of light (1.68–13.67 cd/m2) of optimal orientation and velocity for each neuron, presented against a uniform gray background of 0.16 cd/m2 within the best area of the receptive field and spatially aligned with the auditory stimulus. Cross-modal stimulus combinations consisted of the sequential presentation of the visual and auditory stimuli aligned in space but separated in time. In this experiment, the configurations of cross-modal stimuli were different from what has usually been used in the past (King and Palmer, 1985; Meredith and Stein, 1986; Peck, 1996; Stein, 1998; Wallace et al., 1998; Jiang et al., 2001; Alvarado et al., 2007a,b), where visual-auditory stimuli were simultaneous or in close temporal register. Here, the sequential visual and auditory stimuli were separated in time by a minimal stimulus-onset asynchrony (SOA) that would evoke two separate responses. These minimal SOAs allowed quantification of the properties of each individual response while retaining the possibility of interactions between the sensory channels. In the present experiments, the sequential presentation of stimuli provided a circumstance in which the afferents conveying inputs from the first stimulus in the sequence were activated before the onset of the second response (driven by the afferents conveying inputs from the second stimulus). To determine the appropriate SOA for each experiment, an online estimate of each unisensory response latency and duration was first conducted by presenting individual test stimuli for 10–15 trials. The modality of the “first” (i.e., leading) and “second” (i.e., trailing) stimulus in the cross-modal stimulus pairing was arbitrarily selected for each neuron. 
Once these properties were determined, the neuron was subjected to the testing paradigm.
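One way to read the SOA selection described above is as a simple timing constraint: the second response should not begin until the first has ended. The sketch below is a hypothetical helper, not the authors' procedure. The function name `minimal_soa`, the optional `gap_ms` safety margin, and the example numbers are all assumptions introduced here for illustration.

```python
def minimal_soa(first_latency_ms, first_duration_ms, second_latency_ms, gap_ms=0.0):
    """Smallest SOA (ms) at which the second response starts after the first ends.

    Assumed rule (not stated in the source): the first response ends at
    first_latency + first_duration, and the second response starts at
    SOA + second_latency. gap_ms is an optional quiet period between them.
    """
    first_offset = first_latency_ms + first_duration_ms
    soa = first_offset + gap_ms - second_latency_ms
    return max(soa, 0.0)  # a negative SOA would reverse the stimulus order


# Illustrative unisensory estimates (not data from the study).
soa = minimal_soa(first_latency_ms=20.0, first_duration_ms=150.0,
                  second_latency_ms=70.0, gap_ms=10.0)
print(soa)  # 110.0
```

In practice the online latency and duration estimates would come from the 10–15 unisensory test trials mentioned above.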
Figure 1 illustrates how SC responses to sequentially presented cross-modal stimuli might change as a consequence of STDP. Each stimulus generates a signal conveyed to the SC neuron through its many afferents, but is represented here as generating a single input trace. Each input trace induces activity in that SC neuron when it is above threshold. According to the principle of STDP, afferents will be potentiated when they are active before the activation of the target neuron. Thus, while both stimuli in the cross-modal sequence initiate activity in that neuron, afferents conveying inputs from the first stimulus are given a disproportionate amount of “credit” because their activation precedes both SC responses in the sequence. Afferents conveying inputs from the second stimulus, in contrast, only receive credit for the second response. Thus, the net effect of cross-modal exposure would be an enhancement of the afferents communicating the first stimulus in the sequence, especially afferents that are active at significant poststimulus delays; that is, afferents that are active closer to the initiation of the second response. Consequently, the response to the first stimulus would become larger and longer, potentially extending well beyond the separation of the first and second responses. The response to the second stimulus, when the stimuli are presented together, would become quicker and greater due to the increased input from the first stimulus. Repeated stimulation of the same afferents (as in within-modal stimulus pairings) would not consistently yield the same effects because, although it may alter the response, there is not the same inequality in the relationship between presynaptic and postsynaptic activity. In other words, the sequential cross-modal stimuli yield an inequality in the relationship between the presynaptic and postsynaptic activity that is not present when a within-modal stimulus is presented.
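The credit-assignment argument above can be illustrated with a generic pair-based STDP rule using an exponential timing window. This is a minimal sketch under stated assumptions, not the model used in the study: the amplitudes `a_plus` and `a_minus`, the time constant `tau_ms`, and all spike times are hypothetical values chosen for illustration.

```python
import math


def stdp_weight_change(pre_times, post_times, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Sum pair-based STDP updates over all pre/post spike pairs (times in ms)."""
    dw = 0.0
    for t_pre in pre_times:
        for t_post in post_times:
            dt = t_post - t_pre
            if dt > 0:    # afferent fired before the target neuron: potentiation
                dw += a_plus * math.exp(-dt / tau_ms)
            elif dt < 0:  # afferent fired after the target neuron: depression
                dw -= a_minus * math.exp(dt / tau_ms)
    return dw


# Illustrative spike times (not recorded data): one SC spike per response.
post_spikes = [20.0, 120.0]
first_afferent = [10.0]    # precedes both SC responses
second_afferent = [110.0]  # precedes only the second SC response

dw_first = stdp_weight_change(first_afferent, post_spikes)
dw_second = stdp_weight_change(second_afferent, post_spikes)

# A first-stimulus afferent that fires late sits closer to the second response.
dw_early = stdp_weight_change([10.0], [120.0])
dw_late = stdp_weight_change([90.0], [120.0])

print(dw_first > dw_second, dw_late > dw_early)  # True True
```

With these assumed parameters the first-stimulus afferent accumulates more net potentiation than the second, and late first-stimulus activity is credited more strongly for the second response than early activity. The size of the difference depends entirely on the assumed window.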
General testing paradigm.
Responses to visual, auditory, and sequential cross-modal stimuli were evaluated before, during, and after the presentation of a large number of “exposure” trials. Before the exposure trials, “baseline” data were acquired by presenting the visual, auditory, and sequential cross-modal stimuli in a randomly interleaved pattern for 20 trials, with 10–12 s intertrial intervals to avoid habituation associated with repeated stimulus presentation. During the exposure trials the same sequentially ordered cross-modal stimuli were presented for 50–80 trials with 10–12 s intertrial intervals. After the exposure trials, the responses to the auditory, visual, and sequential cross-modal stimuli were reevaluated using the same methods used to acquire the preexposure baseline data.
To determine whether the repetition of any stimulus would produce the effect, several variations of the paradigm were engaged. The first variation involved repetitions of a single, modality-specific stimulus. The second variation involved within-modal (visual-visual, or auditory-auditory) stimulus pairs. In both cases the specific stimuli chosen were the same as those used in the sequential cross-modal stimuli for that neuron. The parameters of the sequential within-modal (visual-visual or auditory-auditory) stimulus pairs also mimicked those involved in the cross-modal experiments.
Data acquisition and analysis.
Custom software acquired raw data waveforms and impulses from single neurons (after A/D conversion) identified using a threshold criterion of 3 × elevation of the action potential amplitude above background noise. Impulse times were recorded for each trial with a 1 ms resolution. The mean cumulative impulse count was computed from each raster by counting the number of impulses generated on or before each time bin and averaging across trials. This was then converted to the mean cumulative stimulus-driven impulse count (qsum) by subtracting, at each moment in time, the number of impulses expected given a linear extrapolation based on the mean “spontaneous” firing rate calculated in the 500 ms window preceding the stimulus. Changes in the slope of the qsum (forming angles in the graph) indicate changes in the underlying instantaneous firing rate. For responses to presentations of single modality-specific stimuli, a geometrically based algorithm was used to determine the onset and offset of the response, thereby defining the response window (Rowland et al., 2007; Rowland and Stein, 2008). This algorithm was adapted to identify the onsets and offsets of the two responses evoked by cross-modal stimuli by restricting the time windows in which the algorithm operated (either the window before or after the presentation of the second stimulus). The latency of each response was identified as the difference between the response onset and the corresponding stimulus onset. The duration of each response was identified as the difference between the response offset and response onset. The magnitude of each response was identified as the mean number of impulses occurring in the response window minus the expected number given the spontaneous firing rate. Differences in the responses to different stimuli for each measure were quantified both in terms of raw difference and percentage difference.
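The qsum and the three response measures defined above can be sketched in a few lines. This is an illustrative reading of the description, not the authors' custom software. It assumes impulse times are given in ms relative to the onset of the first stimulus (t = 0), and it takes the response onset and offset as inputs rather than implementing the geometric algorithm of Rowland et al. (2007). The function names are hypothetical.

```python
def spontaneous_rate(trials, baseline_ms=500):
    """Mean spontaneous rate (impulses/ms) in the window preceding t = 0."""
    n_spont = sum(1 for trial in trials for t in trial if -baseline_ms <= t < 0)
    return n_spont / (len(trials) * baseline_ms)


def qsum(trials, t_max_ms, baseline_ms=500):
    """Mean cumulative stimulus-driven impulse count for each 1 ms bin in 0..t_max_ms."""
    rate = spontaneous_rate(trials, baseline_ms)
    curve = []
    for t_bin in range(t_max_ms + 1):
        # Impulses generated on or before this bin, averaged across trials.
        cumulative = sum(1 for trial in trials for t in trial if 0 <= t <= t_bin)
        # Subtract the count expected from linear extrapolation of the spontaneous rate.
        curve.append(cumulative / len(trials) - rate * t_bin)
    return curve


def response_measures(trials, stim_onset_ms, onset_ms, offset_ms, baseline_ms=500):
    """Latency, duration, and magnitude for a response window [onset_ms, offset_ms]."""
    rate = spontaneous_rate(trials, baseline_ms)
    in_window = sum(1 for trial in trials for t in trial if onset_ms <= t <= offset_ms)
    duration = offset_ms - onset_ms
    return {
        "latency_ms": onset_ms - stim_onset_ms,
        "duration_ms": duration,
        "magnitude": in_window / len(trials) - rate * duration,
    }


# Two made-up trials (not recorded data); negative times are prestimulus impulses.
example_trials = [[-400, -100, 50, 60, 70], [-300, 55, 65]]
curve = qsum(example_trials, t_max_ms=100)
measures = response_measures(example_trials, stim_onset_ms=0, onset_ms=50, offset_ms=70)
```

For the second response in a cross-modal sequence, `stim_onset_ms` would be the SOA, so that latency is measured from the second stimulus.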
Data were compared statistically to determine significant differences using algorithms in MATLAB 6.0 (The MathWorks) and SPSS 11.5 (SPSS Inc.). Latency and duration measures were computed over all trials in each response, and thus comparisons were made at the population level between different neurons using ANOVA, χ², and Kolmogorov–Smirnov tests where appropriate. As a general arbitrary standard, we considered a >30% change in the duration of a response or a >5 ms change in its latency to be “substantial.” The magnitudes of the responses to different stimuli were compared to one another using t tests within each neuron.
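The “substantial” criteria can be written as two small predicates. This is a sketch only: the function names are hypothetical, and treating the thresholds as applying to changes in either direction is an assumption, since the text does not state whether the sign matters.

```python
def is_substantial_duration_change(before_ms, after_ms, threshold_pct=30.0):
    """True if response duration changed by more than threshold_pct percent."""
    percent_change = 100.0 * (after_ms - before_ms) / before_ms
    return abs(percent_change) > threshold_pct


def is_substantial_latency_change(before_ms, after_ms, threshold_ms=5.0):
    """True if response latency changed by more than threshold_ms milliseconds."""
    return abs(after_ms - before_ms) > threshold_ms


# Illustrative values (not data from the study).
print(is_substantial_duration_change(200.0, 270.0))  # True: +35%
print(is_substantial_latency_change(80.0, 77.0))     # False: only 3 ms
```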
The principal analysis focused on the differences in responses obtained before and after the exposure trials for each of the response measures (latency, duration, magnitude) and each of the stimulus types (visual, auditory, visual-auditory, visual-visual, auditory-auditory). Also identified were changes in the relationships between responses in a stimulus pair that were collected when the component stimuli were presented alone or in tandem (in other words, relationships evidencing multisensory interactions). Although all relationships were analyzed, in the analysis of multisensory interactions we present only the results for responses to the second stimulus in the pair, as changes in the responses to the first stimulus were largely the same regardless of whether it was presented alone or in tandem in the postexposure trials.
Results
A total of 128 multisensory visual-auditory neurons were examined from all quadrants of the multisensory (deep) layers of the SC. Their receptive fields were topographically organized and the visual and auditory receptive fields of a given SC neuron were in good spatial register with one another. Their response properties were very similar to those previously reported. Thus, although not a focus of the present study, it was noted that spatiotemporally concordant cross-modal visual-auditory stimuli generally elicited significantly enhanced responses when simultaneously presented within their receptive fields as previously described (King and Palmer, 1985; Meredith and Stein, 1986; Peck, 1996; Kadunce et al., 1997; Stein, 1998; Wallace et al., 1998; Jiang et al., 2001; Perrault et al., 2003; Stanford et al., 2005; Alvarado et al., 2007a,b).
It was immediately clear that the response properties of many of these multisensory neurons changed in predictable ways when repeatedly stimulated with sequential cross-modal stimuli. The term “sequential cross-modal stimuli” is used here because these stimuli were unlike those used in typical studies of multisensory integration in that the responses they evoked were clearly separated in time. Nevertheless, the use of the minimum SOA did not preclude interactions between them. Indeed, the essential finding here is that the responses that began as two seemingly independent unisensory responses tended to “merge” during repeated presentation of the sequential cross-modal stimuli into what appeared to be a single response. This effect proved to be a result of increases in the duration and magnitude of the first response and decreases in the latency and increases in the magnitude of the second response. This tendency was obvious in the first few neurons studied (n = 17), and four examples of the responses collected during their exposure trials are illustrated in Figure 2. Responses are displayed in rasters where sequential trials are ordered from bottom to top.
In the neurons shown in Figure 2A–D, the trend toward a merging of the two responses appeared to begin between trials 23–35 (A), 17–25 (B), 15–25 (C), and 25–35 (D). The progression was fully elaborated within comparatively few trials thereafter and, once established, did not change substantially with further repetitions of the sequential cross-modal stimuli. Although the two responses could still be identified by two “hot spots” of activity within the discharge trains, they were now no longer two independent responses separated by an intervening period of quiescence.
The tendency for the responses to merge was apparent in 70% (87/124) of the cases that were subsequently studied in greater detail with preexposure, exposure, and postexposure trials. In these cases the response trend generally (90%, 78/87) became apparent within 15–40 repetitions of the sequential cross-modal stimuli, with a discernable sequence of progressively more robust and longer responses to the first stimulus and a shortening of the latency of the second response. In rare cases (10%, 9/87), the trend required >50 repetitions of the stimulus complex before becoming apparent. However, based on the initial studies described above, a criterion was set: if the trend did not emerge after 80 repetitions, the case was operationally determined to be “unaffected” by the exposure experience (37/124, 30%) and the exposure trials were terminated.
To determine whether the observed effects were specific to the cross-modal exposure, or simply exposure to any stimulus, we studied the responses of 56 neurons to 50 repeated presentations of a single visual or auditory stimulus at equivalent rates (Fig. 3). Responses were measured separately for the first 20 and last 20 trials, and did not appear to change significantly or in a consistently predictable way. Mean response magnitudes were very similar in the first 20 trials (visual: 5.36 ± 3.58 impulses, auditory: 4.08 ± 3.77 impulses) and the last 20 trials (visual: 5.36 ± 3.34 impulses, auditory: 4.34 ± 3.72 impulses) (ANOVA, F(1,178) = 0.03, p = 0.859). Mean durations were also very similar in the first 20 trials (visual: 209.5 ± 106.3 ms, auditory: 168.5 ± 86.8 ms) and last 20 trials (visual: 222.9 ± 121.2 ms, auditory: 190.3 ± 113.8 ms) (Kolmogorov–Smirnov test; Z = 0.745, p = 0.635). In fact, very few of the samples showed significantly larger response magnitudes (visual: 8/56 = 14.3%, mean change of 3.9%; auditory: 5/34 = 14.7%, mean change of −0.3%) or substantial (>30%) increases in duration (visual: 8/56 = 14.3%, mean change of 1.7%; auditory: 3/34 = 8.8%, mean change of 2.9%) over the modality-specific exposure period. Thus, the effects produced by exposure to sequential cross-modal stimuli did not appear with repeated presentation of modality-specific stimuli.
Figure 4 illustrates the responses of a typical neuron to visual, auditory, and sequential auditory-visual stimuli collected before and after 50 exposures to the sequential auditory-visual stimuli. As seen in Figure 4A, the neuron's visual and auditory receptive fields were in spatial register with one another and the interleaved visual, auditory, and auditory-visual stimuli were delivered at overlapping locations within the best areas of each receptive field. From the preexposure impulse rasters illustrated in Figure 4B it is clear that each modality-specific stimulus evoked a well defined response, and that the sequential cross-modal combination (in this case, auditory before visual) evoked two distinct sequential responses, one auditory and one visual. There were no consistently predictable changes in the neuron's responses over the course of these randomly interleaved preexposure trials. Despite this, the response to the first stimulus still could influence the latency of the second response, which was 92 ms when they were presented together but 100 ms when the second stimulus was presented alone.
In the postexposure rasters (also in Fig. 4B), it is clear that several significant changes in the response had taken place. The duration and magnitude of the response to the first stimulus increased; the duration and magnitude of the response to the second stimulus increased; and the latency of the response to the second stimulus dramatically decreased: it was now 65 ms when presented with the first stimulus versus 96 ms when it was presented alone. These changes developed within the first 20 exposure trials (Fig. 4D) and persisted despite the randomly interleaved nature of the postexposure trials. The impact of the exposure trials on these responses was not dependent on a particular modality-specific stimulus being first or second in the sequentially ordered pair. To illustrate this fact, the example neuron provided in Figure 5 was tested with the reverse sequence of stimuli (i.e., the visual stimulus preceded the auditory). Postexposure trials once again revealed that the duration and magnitude of the response to the first stimulus in the sequence increased. In this particular example, the magnitude of the second response did not significantly change. Within the preexposure period, the latency of the response to the second stimulus was shorter when presented in tandem with the first (16 ms) versus when it was presented alone (25 ms). In this example these latencies were not substantially shortened further in the postexposure trials, possibly because they were already so short.
To confirm the generality of these observations, a group of neurons (n = 13) was studied in which the two possible cross-modal exposure sequences (visual before auditory, auditory before visual) were presented to each neuron. The initial test order was varied, but the merging of responses was seen in 9/13 of the neurons regardless of stimulus order. Both the magnitude and duration of the response to the first stimulus increased, while the only consistent change in the response to the second stimulus was a shortening of its latency when they were presented in tandem. In 3/13 neurons the changes were extinguished when the stimulus order was reversed. Another neuron did not show any changes when either order was presented.
Described below are the data trends that were evident across the entire population of neurons studied. In our descriptions of the data, we provide the frequency with which different observations were made and the mean percentage change in each measure across the population; thus, for example, the notation “A/B = C%, mean change of D%” indicates that A of the B studied neurons (representing C% of the studied population) showed a significant change, while the mean proportionate change across the entire population after the exposure trials (including data from samples that did not show significant changes) was D%.
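The “A/B = C%, mean change of D%” notation combines a count over significant cases with a mean over all cases. The sketch below shows how such a summary string could be assembled; the function name `summarize_changes` and the input values are hypothetical and are not data from the study.

```python
def summarize_changes(percent_changes, is_significant):
    """Format 'A/B = C%, mean change of D%' for one response measure.

    percent_changes: percentage change for every sample (significant or not).
    is_significant: one boolean per sample.
    """
    a = sum(is_significant)                       # samples with a significant change
    b = len(is_significant)                       # all studied samples
    c = 100.0 * a / b                             # percentage showing the change
    d = sum(percent_changes) / len(percent_changes)  # mean over the entire population
    return f"{a}/{b} = {c:.1f}%, mean change of {d:.1f}%"


# Four made-up samples.
print(summarize_changes([40.0, 10.0, -5.0, 55.0], [True, False, False, True]))
# 2/4 = 50.0%, mean change of 25.0%
```

Note that D averages over every sample, so it can be small even when C is large, and vice versa.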
Sequential cross-modal exposure-induced changes that were evident when modality-specific stimuli were presented individually during postexposure testing
Figure 6 details the changes in responses to modality-specific stimuli after exposure to sequential cross-modal stimuli across the population of studied neurons (N = 111, 13 of which were sampled twice with opposite cross-modal stimulus sequences [see above]). The results are presented separately for (1) each of the response measures (magnitude, latency, and duration), (2) stimulus modality (visual or auditory), and (3) order in the sequential cross-modal stimuli (first or second). The scatter plots on the left (Fig. 6A,C,E) compare the response properties before exposure (x-axis) and after exposure (y-axis), while the summary figures on the right (Fig. 6B,D,F) show the cumulative distributions of the percentage difference for each response property before and after the exposure trials. As described above, the order of the stimuli in the sequential cross-modal exposure trials was the primary factor in determining the effects, not stimulus modality. Thus, the effects on the response to the first stimulus in the sequence were qualitatively the same regardless of whether it was visual or auditory. Consequently, the visual-first and auditory-first samples are combined in the summary figures.
One consistent result of the cross-modal exposure trials across neurons was that the magnitude of the response to the first stimulus in the sequence significantly increased in most cases (visual: 30/55 = 54.5%, mean change of 37.2%; auditory: 46/69 = 66.7%, mean change of 86.2%) (ANOVA, F(1,246) = 10.883, p < 0.01) when that stimulus was presented alone (Fig. 6A,B). The duration of the response to the first stimulus also often increased >30% when it was presented alone (visual: 33/55 = 60.0%, mean change of 43.2%; auditory: 39/69 = 56.5%; mean change of 78.7%) (ANOVA, F(1,246) = 23.131, p < 0.001) (Fig. 6C,D). There was no appreciable effect of the exposure trials on the latency of the response to the first stimulus (ANOVA, F(1,246) = 0.001, p = 0.97).
The effects of the exposure trials on the response to the second stimulus in the sequence (when it was later presented alone) were far less consistent than the effects on the first. After the exposure trials, the responses to the second stimulus when presented alone were not consistently significantly larger in magnitude (visual: 24/69 = 35%, mean change of 26.1%; auditory: 11/55 = 22%, mean change of 10.0%) (ANOVA, F(1,246) = 0.596, p = 0.441) or >30% longer in duration (visual: 14/69 = 20.3%, mean change of 19.0%; auditory: 6/55 = 10.9%, mean change of 10.6%) (ANOVA, F(1,246) = 0.885, p = 0.348). It was relatively uncommon (4/20 = 20%) to observe more than a 30% increase in the duration of the response to the second stimulus in the absence of a similar increase in the duration of the response to the first stimulus. There was no appreciable effect of the exposure trials on the latency of the response to the second stimulus when it was presented alone (ANOVA, F(1,246) = 0.012, p = 0.912).
To examine the change in the first response after sequential cross-modal exposure more clearly, the firing rate (response magnitude/duration) was examined in the subset of cases (n = 72) in which the duration of the first response increased >30% after the exposure trials. As shown in the individual example presented in Figure 7A, exposure significantly increased the overall firing rate of the response to the first stimulus, even when it was presented alone as shown here, and this proved to be characteristic of the majority of cases (Fig. 7B). However, the change in the magnitude of the first response was not uniform over the duration of the response, but became larger as the response unfolded. In only a minority of cases (21/72 = 29.2%) was there a significant increase in firing rate during the initial (i.e., first half) portion of a response, and even in these cases the size of that effect was small (Fig. 7C). In contrast, the overwhelming majority (52/72 = 72.2%) of responses to the first stimulus showed a significant firing rate increase during the second half of the response (χ² test, df = 1, χ² = 26.7, p < 0.001), and that effect size was much larger (Fig. 7D). The cumulative frequency distribution in Figure 7E shows that the enhanced firing rate was far greater in the second half (second window) of the response.
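The first-half versus second-half comparison can be sketched by splitting the response window at its midpoint and computing a firing rate in each half. This is an illustrative helper, not the authors' analysis code. The name `half_window_rates` is hypothetical, the spike times are invented, and the sketch does not subtract the spontaneous rate, which is an assumption.

```python
def half_window_rates(trials, onset_ms, offset_ms):
    """Mean firing rate (impulses/s) in the first and second half of a response window."""
    midpoint = (onset_ms + offset_ms) / 2.0
    half_duration_s = (offset_ms - onset_ms) / 2.0 / 1000.0
    first = sum(1 for trial in trials for t in trial if onset_ms <= t < midpoint)
    second = sum(1 for trial in trials for t in trial if midpoint <= t <= offset_ms)
    n_trials = len(trials)
    return first / (n_trials * half_duration_s), second / (n_trials * half_duration_s)


# Two made-up trials with a response window of 0-100 ms.
first_half, second_half = half_window_rates(
    [[10, 60, 70, 90], [20, 55, 80]], onset_ms=0, offset_ms=100
)
print(first_half, second_half)
```

Running this on preexposure and postexposure rasters separately would give the per-half change in firing rate for each neuron.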
Sequential cross-modal exposure-induced changes that were evident when stimuli were presented together in sequence
Many of the exposure-induced changes noted above were also apparent in postexposure trials in which the two stimuli were presented in tandem (Fig. 8). That the changes in the magnitude and duration of the first response were similar to those described above was expected, given that the SOA was great enough that the first response was ending before the second stimulus was delivered. However, the exposure trials had a substantial impact on the response to the second stimulus when it was presented in tandem with the first. In these cases, the response to the second stimulus in the sequence was typically significantly larger in magnitude (visual: 45/69 = 67.2%, mean change of 54.4%; auditory: 35/55 = 64.8%, mean change of 71.7%) (ANOVA, F(1,246) = 11.352, p = 0.001) and >30% longer in duration (visual: 41/69 = 59.4%, mean change of 57.3%; auditory: 31/55 = 56.4%, mean change of 57.1%) (ANOVA, F(1,246) = 24.868, p < 0.001). In addition, the latency of the response to the second stimulus was now usually much shorter (i.e., by >5 ms) (visual: 47/69 = 68.1%, mean change of −11.2 ms or −11.2%; auditory: 23/55 = 41.8%, mean change of −4.7 ms or −16.9%) (ANOVA, F(1,246) = 24.868, p < 0.001). These combined changes in the first and second responses shortened the gap between them, giving the impression of a more continuous, or single “merged” response.
Sequential cross-modal exposure-induced enhancement of multisensory integration
As noted above, the long SOAs were such that the first response ended before the second response began. Thus, exposure-induced changes in the first response were no different when it was presented alone or in the cross-modal sequence. However, changes in the second response differed in these two configurations, raising the possibility that despite the seeming independence of the two responses, there was some interaction between their inputs even before exposure. This would indicate that even this nontraditional stimulus configuration could engage the neuron's capacity for multisensory integration. To examine this possibility, and the possibility that these relationships changed as a consequence of the exposure to the cross-modal stimulus, two comparisons were made. The first involved a comparison of the relative changes induced by the sequential cross-modal stimuli (i.e., relative to the responses evoked by a single modality-specific component stimulus) before the exposure period. This would determine whether any multisensory integration was already taking place despite the presence of two distinct, and apparently “unisensory” responses. The second involved a comparison of these relative differences before and after the exposure period to determine whether multisensory integration itself was altered by the exposure period (Fig. 9).
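The two comparisons described above both rest on a relative-change measure: the response to a stimulus embedded in the cross-modal sequence expressed relative to the response to that stimulus alone. A minimal sketch follows, assuming a simple percentage-change formula; the function name and all response values are hypothetical and are not data from this study.

```python
def relative_change(paired, alone):
    """Percent change of the in-sequence response relative to the stimulus-alone response."""
    if alone <= 0:
        raise ValueError("the stimulus-alone response must be positive")
    return 100.0 * (paired - alone) / alone


# Hypothetical mean responses (impulses/trial) to the second stimulus, for illustration only.
pre_exposure = relative_change(paired=11.0, alone=10.0)    # comparison 1: before exposure
post_exposure = relative_change(paired=15.0, alone=10.0)   # same measure after exposure

# Comparison 2: did the relative change itself grow with exposure?
exposure_effect = post_exposure - pre_exposure
print(pre_exposure, post_exposure, exposure_effect)  # 10.0 50.0 40.0
```

The first value addresses whether any integration was present before exposure; the difference between the two values addresses whether exposure altered it.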
These comparisons revealed that multisensory integration was already present (albeit minimal) in some neurons in the preexposure responses. However, multisensory integration was substantially elevated following the exposure trials (Fig. 9A–C). This was evident in both response magnitude (before exposure: 31/124 = 25.0%, mean change of 13.7%; after exposure: 73/124 = 58.9%, mean change of 50.6%) (χ² test, df = 1, χ² = 29.212, p < 0.001; ANOVA, F(1,246) = 28.675, p < 0.001) and response duration (before exposure: 30/124 = 24.2% showed >30% changes, mean shift of the entire population 23.1%; after exposure: 75/124 = 60.5% showed >30% changes, mean shift of the entire population 71.9%) (χ² test, df = 1, χ² = 33.447, p < 0.001; ANOVA, F(1,246) = 25.88, p < 0.001). Also consistent with previous observations (Rowland et al., 2007; see also Bell et al., 2006; Wang et al., 2008; Royal et al., 2009; Zahar et al., 2009), the latency of the response to the second stimulus was shorter when that stimulus was embedded in the sequential cross-modal paradigm than when presented alone. However, this latency was further shortened after exposure (before exposure: 43/124 = 34.7% showed >5 ms shifts, mean change of the entire population −4.9 ms or −17.8%; after exposure: 81/124 = 65.3% showed >5 ms shifts, mean change of the entire population −12.0 ms or −6.7%) (χ² test, df = 1, χ² = 23.290, p < 0.001; ANOVA, F(1,246) = 43.271, p < 0.001).
Sequential within-modal exposure did not produce equivalent effects
A subset of neurons (n = 64) was examined to determine whether results like those above could be obtained with within-modal stimulus pairs. Exposure trials were composed of sequential visual-visual (n = 47) or auditory-auditory (n = 17) stimuli, presented at the same SOAs and repetition rates as the sequential cross-modal stimuli. However, they rarely produced predictable or consistent response changes. Figure 10 shows the results obtained from a typical experiment with sequential visual-visual exposure trials and Figure 11 shows results obtained from a typical experiment with auditory-auditory exposure trials. Data from the population of neurons tested in this way are shown in Figure 12. It is evident from this figure that the exposure trials had very little effect on the responses of the population of multisensory neurons. Neither the first response nor the second response showed substantial and/or reliable changes in duration (Fig. 12A) or magnitude (Fig. 12B). Thus, few of the first responses were significantly enhanced after exposure (visual: 4/47 = 8.5%, mean change of −12.2%; auditory: 1/17 = 5.9%, mean change of 1.9%), and virtually none of the second responses significantly increased in magnitude (visual: 3/47 = 6.4%, mean change of −11.2%; auditory: 1/17 = 5.9%, mean change of −1.8%). Response durations of the first response were rarely increased by >30% after the exposure trials (visual: 5/47 = 10.6%, mean shift of 0.2%; auditory: 2/17 = 11.8%, mean shift of 1.5%), and the same was true for response durations of the second response (visual: 3/47 = 6.4%, mean shift of −4.0%; auditory: 0/17 = 0%, mean shift of 1.5%). As shown in Figure 12C, repetitive within-modal exposure also generally failed to shorten the latency of the response to the second stimulus in the sequence by >5 ms (13/64 = 20.3%, mean change of −1.8 ms; mean shift of −2.3%).
The present demonstration that SC neurons rapidly alter their responses to adapt to repeated cross-modal sensory stimuli shows that the neural mechanisms underlying multisensory processes retain plasticity well into adulthood. The neural adaptations to repetitions of the same stimulus configuration were manifested in several consistent response changes. The magnitude and duration of the neuron's response to the first stimulus in the sequence were increased, and the magnitude of its response to the second stimulus was also increased. These changes were evident regardless of whether the stimuli were presented individually or in sequence during the postexposure trials. In addition, the latency of the second response was significantly decreased, but this was only evident when the stimuli were presented in tandem.
The effect of these changes was to transform what had been two clearly defined unisensory responses into a single continuous response with two hotspots. The response changes developed rapidly during the stimulus exposure period, and were independent of the particular order in which the visual and auditory stimuli were presented. Thus, they were just as evident after exposure periods in which the auditory stimulus preceded the visual as when the visual stimulus preceded the auditory. Furthermore, they far outlasted the period of exposure, and some of them were evident even when the component stimuli were presented independently. However, this was very much a cross-modal-dependent effect: comparable changes could not be produced by repeated presentation of either the visual or auditory stimulus independently, or by sequential within-modal (visual-visual or auditory-auditory) stimuli, even though their parameters were similar to those in the cross-modal sequence. We assume that this disparity is due to the fact that, in the within-modal trials, the afferents repeatedly stimulated were predominantly the same. This contrasts with the sequential cross-modal stimuli, whose presentation could cause the afferents conveying signals from the first stimulus to be given undue credit for the activation actually produced by the second.
By using activation of two separate input channels representing different senses, the present experiments were able to demonstrate the presence of an in vivo sensory adaptation that is consistent with the STDP learning algorithm (Dahmen et al., 2008). These observations suggest that multisensory circuits in the midbrain share a common algorithm with adaptations observed in vitro within unisensory cortical circuits, where separate afferents can also be manipulated independently, albeit with direct electrical activation (De Felipe et al., 1997; Schuett et al., 2001; Fu et al., 2002, 2004; Mu and Poo, 2006; Jacob et al., 2007; Young et al., 2007). Perhaps this commonality should not be surprising, as STDP provides a simple computational method for constructing circuits that bind together distinct afferents with their target neurons via their tightly timed pre/postsynaptic sequential activation; and this is just as important when binding different features within a single sensory representation in cortex as when binding different sensory representations in the midbrain.
Key features in the present experimental paradigm include the common period in which both components of the cross-modal stimulus can affect the target neuron, and the interval between these stimuli being just long enough to make possible the examination of changes in the two responses independently. This latter factor is relevant because the learning algorithm makes different predictions for these two responses. The first response should be enhanced because its afferents are becoming more effective (e.g., their presynaptic and/or postsynaptic components are being enhanced) as a consequence of the activity induced in the target neuron by the second stimulus. This is because the increased firing of the target neuron follows, and therefore appears to be related to, the activation of the first set of afferents, even though the additional responses elicited are somewhat separated in time from activation of the first set of afferents, and are initiated by different agents (stimuli and afferents). Although the activity of the postsynaptic target neuron also follows the activity of the second set of afferents, they do not have the same functional relationship to that neuron as does the first set of afferents (activation of the first set of afferents precedes all the neuron's activity), and according to the rule, would not be subject to the same potentiation.
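The asymmetry between the two sets of afferents described above can be illustrated with a generic pair-based STDP rule. This is only a sketch under stated assumptions, not a model fitted to these data or one used by the authors: the exponential window, the amplitudes, the 50 ms time constants, and all spike times are hypothetical values chosen for illustration.

```python
import math


def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=50.0, tau_minus=50.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre.

    Pre-before-post (dt_ms > 0) potentiates; post-before-pre (dt_ms < 0) depresses.
    All parameter values are illustrative assumptions.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def net_change(pre_times, post_times):
    """Sum the pairwise weight changes over all pre/post spike pairs."""
    return sum(stdp_dw(post - pre) for pre in pre_times for post in post_times)


# Hypothetical timeline (ms): afferents for the first stimulus fire at 0,
# afferents for the second stimulus fire at 100, and the target neuron
# responds shortly after each input.
first_afferents = [0.0]
second_afferents = [100.0]
post_spikes = [15.0, 25.0, 35.0, 115.0, 125.0, 135.0]

dw_first = net_change(first_afferents, post_spikes)
dw_second = net_change(second_afferents, post_spikes)
print(dw_first > dw_second)  # True
```

In this toy timeline the first set of afferents precedes every postsynaptic spike and therefore accumulates only potentiation, including a small contribution from the activity evoked by the second stimulus. The second set follows the first response, so part of its potentiation is offset by depression, and its net change is smaller.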
Another perspective in interpreting these results is that, as a consequence of the cross-modal experience, the neural circuit adapts by “anticipating” the occurrence of the second stimulus in a cross-modal sequence. This is particularly interesting in this context because not only were the stimuli neither meaningful nor linked to reward, but these experiments were conducted in anesthetized preparations. Thus, these changes appear to reflect the operation of a mechanism basic to the circuit and not dependent on a high level of awareness. The latency shifts of the second response are particularly important in this context. Although their presence in the preexposure condition is consistent with earlier demonstrations that shortening response latency is a common consequence of multisensory integration (Rowland et al., 2007), their sensitivity to sensory experience seems especially useful in speeding overt responses to later stimuli in a sequence. It would be particularly interesting to know whether the response changes induced here during anesthesia would affect an animal's postanesthesia behavior as predicted, and if so, how closely the neural and behavioral changes would parallel those observed after comparable experiences in alert animals. This remains to be determined.
It is interesting to consider the possibility that the mechanisms underlying the adult multisensory response plasticity demonstrated here are the same mechanisms involved in instantiating this capacity during early life. The SC of the fetal and/or newborn cat is unisensory, with its sensory-responsive neurons being activated only by somatosensory stimuli (Stein et al., 1973). Its neurons do not respond to multiple sensory modalities, and even when multisensory SC neurons first appear during early postnatal maturation, they are incapable of multisensory integration (Wallace and Stein, 1997, 2000; Stein, 2005; see also Wallace and Stein, 2001). They acquire this capability only after many weeks of postnatal experience during which they are presumably learning the statistics of cross-modal events. After a sufficient period of exposure, they are able to integrate multiple sensory inputs and enhance their responses to spatially and temporally concordant cross-modal stimuli (Wallace et al., 2004). Although there is no incontrovertible proof that the STDP algorithm is the one used by the SC here, or in early development, it seems ideal for the entrainment of the principles governing multisensory integration at all times of life.
The use of the same mechanisms for ensuring plasticity in multisensory integration during its acquisition in early life and during adulthood would be a parsimonious neural strategy involving two functional stages. In the first stage, the “superstructure” for multisensory integration would be constructed by adapting its governing principles to the general statistics of the environment in which it would presumably be used. In normal circumstances this requires incorporating the spatial and temporal commonalities of the cues derived from the same event, but it also involves dealing with the fact that the greatest difficulty in detecting, localizing, and identifying events occurs when they are weak or ambiguous. Thus, the largest proportionate multisensory enhancements should be derived from combinations of weakly effective cross-modal cues to make the system maximally useful as expected by the principle of inverse effectiveness (for a recent discussion, see Stein et al., 2009). The second stage (adult) would reflect the flexibility in the use of these principles necessary to accommodate changes in the statistics of the cross-modal events that are peculiar to a given environmental circumstance.
The mechanisms that adapted multisensory integration to the stimulus conditions in the present study were selective. They did not produce equivalent changes when exposure involved repeated modality-specific or within-modal stimuli. This would not be surprising if the STDP learning algorithm does, in fact, underlie these changes, because the within-modal stimuli primarily activated a common set of afferents. Thus, there were not two distinct afferent pathways to associate. These results may reflect the inherent difference between deriving information from a single sensory channel and from multiple sensory channels. Although adaptation to these modality-specific (and/or within-modal) stimulus conditions can take place via habituation or sensitization (Stein and Meredith, 1993), the selectivity observed here suggests that the system recognizes the difference in stimulus contingencies, and that the underlying neural mechanisms involved in adapting to them differ. This may reflect the different circuits involved. For example, multisensory integration is dependent on the functional integrity of unisensory cortico-collicular afferents from the anterior ectosylvian and rostral lateral suprasylvian cortices (Jiang et al., 2001; Stein, 2005), but unisensory SC responses, even those of multisensory neurons, are not (Alvarado et al., 2007, 2008). In some cases these unisensory and within-modal responses are unaffected by temporarily eliminating these cortical inputs, and in others they are only modestly degraded. Thus, the multisensory and unisensory neural circuits are substantially different, and these circuits are likely to have distinctive ways of adapting to sensory experience.
This research was supported by National Institutes of Health Grants EY016716 and NS036916 and a grant from the Wallace Foundation. We thank Nancy London for technical assistance.
Correspondence should be addressed to Barry E. Stein, Department of Neurobiology and Anatomy, Wake Forest University School of Medicine, Winston-Salem, NC 27157-1010.