Elsevier

Brain Research

Volume 1084, Issue 1, 21 April 2006, Pages 115-122

Research Report
Auditory dominance in the error correction process: A synchronized tapping study

https://doi.org/10.1016/j.brainres.2006.02.019

Abstract

The goal of this study was to reveal auditory dominance in the error correction process that operates in the synchronized tapping paradigm. We presented six female subjects with a sound and a flash, alternately and successively. The subjects' task was to tap in synchrony with the sequence of the attended modality (the target sequence). The inter-onset interval of the target sequence was fixed at 996 ms, whereas that of the distractor sequence varied irregularly between 996 − α ms and 996 + α ms. We found that tapping accuracy was influenced by the irregularity of the distractor sequence. Because distractor stimuli were temporally separated from target stimuli, auditory–visual integration in timing did not occur in our experiment. Therefore, the irregularity of the distractor sequence must have influenced the error correction process in timing directly, rather than indirectly by biasing the perceived timing of the target sequence. We also found that tapping accuracy was influenced by the irregularity of the auditory distractor even when the irregularity was unnoticeable, but it was not affected by the irregularity of the visual distractor when the irregularity was unnoticeable, which suggests that the error correction process precedes, or is independent of, the perception of irregularity in timing. We conclude that the error correction process depends more on temporal information from the auditory system than on information from the visual system.

Introduction

We sometimes sway unintentionally to musical rhythms. We can also move our bodies in synchrony with the rhythm of a light or a moving image. Sensorimotor coordination, i.e., moving to an aurally or visually presented rhythm, has been investigated by many researchers. Synchronized tapping, in which a sequence of clicks or flashes is presented to function as a pacing signal and subjects are asked to tap in synchrony with the sequence, constitutes a simple paradigm that is used to investigate sensorimotor coordination (e.g., Refs. Bartlett and Bartlett, 1959, Dunlap, 1910, Klemmer, 1967, Kolers and Brewster, 1985). People often find it easier to tap in synchrony with auditory than with visual pacing sequences. Indeed, in the synchronized tapping paradigm, the majority of studies use auditory stimuli (clicks or tones) as pacing sequences; visual stimuli (flashes) are rarely used. There are several reasons for using clicks (e.g., tapping behavior occurs more naturally when listening to music than when watching a movie, and synchronized movement is essential for playing music); it has also been found that audition is more accurate than vision with regard to temporal resolution (e.g., Ref. Fujisaki and Nishida, 2005). In fact, tapping accuracy is greater if pacing signals are presented aurally rather than visually (Dunlap, 1910, Kolers and Brewster, 1985).

However, there appears to be another reason for tapping being easier with auditory pacing sequences. Recently, Aschersleben and Bertelson (2003) demonstrated the so-called “temporal ventriloquism effect” (see also Ref. Morein-Zamir et al., 2003). They presented bimodal (auditory and visual) pacing sequences to subjects and asked them to attend to either modality and to tap in synchrony with the sequence of their chosen modality. The unattended sequence had a stimulus onset asynchrony (SOA) of −45 to 45 ms, and the effects of the SOA on the asynchronies between taps and the stimuli in the attended pacing sequence were measured. When subjects attended to the visual stimuli, tap-signal asynchrony was affected by the SOA; however, the corresponding effect in the case of an auditory target sequence was relatively small. These results suggest that auditory stimuli preferentially engage the tapping movement.

Repp and Penel (2002) further investigated auditory dominance in synchronized tapping. In their experiments, unimodal (auditory or visual) or bimodal pacing sequences were presented, and subjects were asked to tap in synchrony with the visual sequence in the bimodal condition. One stimulus was shifted from where it would have been had it been presented isochronously. The magnitude of the onset shift was less than 80 ms; this shift, however, induced the subjects' next tap to be shifted in the same direction. Repp (2002) called this immediate, obligatory response the phase correction response. When the sequence was presented bimodally, phase correction responses were larger for an auditory stimulus onset shift than for a visual stimulus onset shift, even though the subjects had been instructed to attend to the visual sequence. Interestingly, the phase correction response occurred even if the stimulus onset shift was so small that the subjects were not aware of it (see also Refs. Repp, 2000, Repp, 2001a for an auditory-only condition). In summary, auditory information appears to be more dominant than visual information in synchronized tapping.

Synchronized tapping is a form of timekeeping behavior. In a synchronization task, there is a stage at which the timing of the external stimulus and that of the sensory feedback (or efferent copy of the motor command) are compared (see Refs. Aschersleben, 2002, Mates, 1994a for reviews). To match the timings of two kinds of events, the subject has to anticipate when the pacing signal occurs. The ability to anticipate timing means that there is some kind of timekeeping system in the brain. Studies of continuation tapping have also suggested the presence of a timekeeping system (Wing and Kristofferson, 1973a, Wing and Kristofferson, 1973b, Wohlschläger and Koch, 2000). However, this timekeeping system is, of course, imperfect, and there is a need for continuous adjustment (i.e., error correction). Some researchers have proposed that this process is composed of two subprocesses: phase correction and period correction (Mates, 1994a, Mates, 1994b, Repp, 2001a, Repp, 2001b, Repp, 2003b). By their definitions, phase correction is a mechanism that minimizes the subjective asynchrony between tap onset and stimulus onset, and period correction is a mechanism that minimizes the difference between the subjective length of the inter-tap interval and the inter-onset interval of the pacing sequence. This is called the dual process model (Repp, 2001b).
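The dual process model described above can be illustrated with a minimal simulation. This is a sketch only: the gain parameters `alpha` (phase correction) and `beta` (period correction), the motor-noise level, and all numeric values are illustrative assumptions, not parameters reported in the paper.

```python
import random

def simulate_tapping(n_taps=50, ioi=996.0, alpha=0.5, beta=0.1,
                     motor_sd=10.0, seed=0):
    """Sketch of the dual-process error correction model.

    Each asynchrony (tap onset minus pacing-signal onset) drives two
    corrections: phase correction shifts the next tap by -alpha * asynchrony,
    and period correction adjusts the internal timekeeper period by
    -beta * asynchrony. Gains and noise level are hypothetical.
    """
    rng = random.Random(seed)
    period = ioi                # internal timekeeper period (ms)
    tap = 0.0                   # first tap coincides with the first signal
    asynchronies = []
    for k in range(n_taps):
        signal = k * ioi                      # isochronous pacing signal
        asyn = tap - signal                   # positive = tap is late
        asynchronies.append(asyn)
        period -= beta * asyn                 # period correction
        # next tap = current tap + period, minus phase correction, plus noise
        tap = tap + period - alpha * asyn + rng.gauss(0.0, motor_sd)
    return asynchronies
```

With these gains the asynchronies remain bounded around zero, which is the qualitative behavior the model is meant to capture; setting `alpha = beta = 0` instead lets motor noise accumulate as a random walk, as in the Wing–Kristofferson continuation setting without feedback.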

The main question of this study is as follows. Does the auditory dominance in tapping come from auditory dominance in the error correction process? Auditory dominance in tapping has been observed repeatedly, but the explanation for this dominance remains unclear. Repp and Penel explained auditory dominance in terms of phase correction (Repp and Penel, 2002). However, it is possible that auditory dominance can be explained from another perspective: auditory–visual integration. In the following paragraph, we review studies of auditory–visual integration in the time domain.

It is well known that when an auditory stimulus and a visual stimulus are presented almost simultaneously, the perceived spatial location or time of occurrence of one stimulus biases that of the other. The ventriloquism effect is the most famous example, in which the perceived location of an auditory stimulus is biased toward the location of the visual stimulus (e.g., Refs. Bermant and Welch, 1976, Bertelson and Radeau, 1981). In the temporal dimension, however, dominance of audition over vision is often observed. A change in the rate of an auditory sequence causes the perceived rate of a simultaneously presented, but constant, visual sequence to change as well (Gebhard and Mowbray, 1959). This is called auditory driving. Shams, Kamitani, and Shimojo observed that when a single flash of light was accompanied by multiple auditory beeps, the single flash was perceived as multiple flashes (Shams et al., 2000, Shams et al., 2002). An event-related potential (ERP) experiment examined this phenomenon and found significantly higher oscillatory and induced gamma band responses in illusion versus no-illusion trials (Bhattacharya et al., 2002). Fendrich and Corballis (2001) also reported auditory dominance in temporal perception. They asked subjects to judge when a flash or a click occurred by reporting the clock position of a rotating visual marker. When a click was presented before or after a flash, judgments of the temporal position of the flash were strongly biased in the direction of the click. A weaker biasing effect occurred when a flash was presented before or after a click whose temporal position was to be judged. How long is the temporal window within which temporal information from audition and vision is integrated? Based on these studies and others (Bertelson and Aschersleben, 2003, Fendrich and Corballis, 2001, Lewald et al., 2001, Morein-Zamir et al., 2003, Repp and Penel, 2004, Shams et al., 2000, Shams et al., 2002), we take 200 ms to be an appropriate value for this temporal window.

These studies suggest that, in synchronized tapping, the onset of tapping may be biased in the direction of an auditory distractor stimulus if a visual target stimulus is followed or preceded by the auditory distractor within a temporal distance of up to 200 ms. That is, the temporal information of the visual stimulus itself may be biased by the presence of an auditory stimulus, and this biased temporal information is then fed into the error correction process mentioned above. Therefore, the results of previous studies (Aschersleben and Bertelson, 2003, Repp and Penel, 2002) can be understood in terms of an auditory bias in the visual stimulus.

Repp and Penel (2002) found that the auditory dominance in tapping was independent of auditory–visual (AV) integration in their experiment, even when the auditory and visual stimuli were presented within the range of AV integration. In that study, a perceptual test showed AV integration in half the subjects but not in the other half, whereas all subjects showed auditory dominance in tapping. However, we believe that this perceptual test was inadequate for measuring AV integration because subjects had to perform the perceptual test and the synchronization task at the same time, which may have distorted their answers. That is, a subject may have been unable to notice the AV integration if the temporal reference of regularity was also distorted by the temporal information of finger movement, the efferent copy of the motor command, or sensory feedback from the finger. We also note that an auditory distractor near an auditory target (same modality) also affects the tap-target asynchrony (Repp, 2003a, Repp, 2004b). Thus, the integration may be independent of modality. Of course, another explanation is possible, such as the subject's attention converging on the auditory distractor stimulus.

Recently, Repp and Penel (2004) have shown auditory dominance in synchronized tapping using extensive manipulation of experimental parameters such as existence/nonexistence of continuous perturbation and relative phases of auditory and visual sequences. However, at a larger temporal distance between auditory and visual stimuli (the maximum was 320 ms), they could not find significant auditory dominance in tapping because of large individual differences (Section 3 of experiment 1 of Ref. Repp and Penel, 2002), although they did find significant auditory dominance in tapping at smaller temporal differences. Repp and Penel discussed the auditory dominance in tapping in terms of a dual process model, but they also noted the possibility of AV integration occurring because the auditory dominance was dependent on the absolute (rather than relative) temporal distance between auditory and visual stimuli.

Although these previous studies show that phase correction is a potential candidate for explaining auditory dominance in tapping (Repp and Penel, 2002), it also remains possible that auditory dominance in tapping may be explained in terms of AV integration, which does not seem to have been completely excluded as an explanation given the framework of the previous studies.

In this context, it is worth examining a case in which auditory dominance in tapping is clearly shown under a condition in which no AV integration occurs. Thus, to reveal the effect of stimulus modality on the error correction process, the auditory and visual stimuli should not be presented close together in time (i.e., within the temporal window of integration). In our study, we used perturbed sequences, as in previous research (Repp and Penel, 2004), but always presented the sequences in antiphase to prevent temporal AV integration. The average inter-onset interval of the combined target and distractor stimuli was set to 498 ms. It was beyond the scope of this study to determine which subprocesses of the error correction process cause the auditory dominance in tapping. This is an issue to be discussed in the future.
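The antiphase design above can be sketched as follows: target onsets are isochronous at 996 ms, while each distractor onset sits nominally midway between targets but is jittered so that successive distractor inter-onset intervals vary within 996 ± α ms. The value of α used here (80 ms) and the sequence length are illustrative placeholders, not values taken from the paper's (truncated) Methods.

```python
import random

def make_sequences(n=20, ioi=996.0, alpha=80.0, seed=0):
    """Sketch of antiphase target/distractor onset times (ms).

    Targets are isochronous at `ioi`. Each distractor is placed ioi/2 after
    its target with uniform jitter of +/- alpha/2, so successive distractor
    inter-onset intervals range over ioi +/- alpha. alpha is hypothetical.
    """
    rng = random.Random(seed)
    targets = [k * ioi for k in range(n)]
    distractors = [t + ioi / 2 + rng.uniform(-alpha / 2, alpha / 2)
                   for t in targets]
    return targets, distractors
```

Note that even at the largest jitter, each distractor stays at least ioi/2 − α/2 ms (here 458 ms) from the nearest target, well outside the 200 ms integration window discussed above, which is the point of the antiphase arrangement.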

First, we sought to replicate the target modality effect by confirming that tapping was more accurate with auditory pacing signals than with visual pacing signals. Second, to reveal the auditory dominance in the error correction process, we measured the effects on tapping accuracy of large and small perturbations in the distractor sequence when presented roughly midway between the target stimuli. We used two different magnitudes of perturbation to investigate whether the perception of perturbation was essential to the error correction process. The magnitude of large perturbations was sufficient for subjects to notice the presence of the perturbation, while that of small perturbations was too small for subjects to notice. In addition, we estimated the effects of an unperturbed distractor stimulus presented midway between target stimuli on tapping. In some cases, it has been found that a stimulus presented midway between target stimuli improves tapping accuracy (e.g., Refs. Repp, 2003b, Wohlschläger and Koch, 2000). This effect is called the subdivision benefit.

Section snippets

Results

Fig. 1, Fig. 2 show the mean asynchrony and mean standard deviation, respectively, as functions of target modality and the distractor condition. It is known that tapping usually precedes the pacing signal (i.e., negative asynchrony) (Aschersleben, 2002). Our data also indicated the same tendency, as shown in Fig. 1. A two-way repeated measure analysis of variance (ANOVA) was conducted on the asynchronies with the modality and distractor variables. We did not find any significant effects of
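The two dependent measures reported above, mean asynchrony and its standard deviation, can be computed from tap and target onset times as sketched below; the example data are hypothetical, chosen only to show the conventional negative (anticipatory) asynchrony.

```python
import statistics

def asynchrony_stats(tap_onsets, target_onsets):
    """Mean and SD of tap-target asynchronies (tap minus target, in ms).

    A negative mean asynchrony corresponds to the usual finding that taps
    anticipate the pacing signal (Aschersleben, 2002).
    """
    asyn = [tap - tgt for tap, tgt in zip(tap_onsets, target_onsets)]
    return statistics.mean(asyn), statistics.stdev(asyn)
```

In the study's design, these statistics would be computed per subject and condition and then entered into the two-way repeated-measures ANOVA described above.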

Discussion

Our finding of larger standard deviations in the visual than in the auditory target without-distractor condition is consistent with previous studies (Bartlett and Bartlett, 1959, Dunlap, 1910, Klemmer, 1967, Repp, 2003b, Repp and Penel, 2002, Repp and Penel, 2004). Apparently, the results reflect the fact that audition is more accurate in temporal resolution than is vision.

We observed that large perturbations in the distractor sequence increased the standard deviation of the tap-target

Conclusion

People often find it easier to tap in synchrony with auditory pacing signals than with visual pacing signals. In fact, tapping accuracy is greater if pacing signals are presented aurally rather than visually. Previous studies have argued that the error correction process in timing shows auditory dominance, which results in the auditory dominance in tapping behavior. We clearly demonstrated the validity of this argument under the condition of no-AV integration; our study also indicated that the

Subjects

Six female college students between 20 and 22 years of age participated in the experiment. The subjects were paid for their participation. All were right-handed and had normal or corrected-to-normal vision and normal hearing. They were naive as to the purpose of the experiment. They gave informed consent, following an explanation of the experimental procedures.

Apparatus and stimuli

The experiment was conducted in a quiet room [background noise: 32 dB(A)] with lighting intensity of 400 lux. Auditory and visual

Acknowledgments

We are grateful to Bruno Repp for his invaluable help in improving the manuscript and to Gisa Aschersleben for her informative comments. We also thank Norihiro Sadato for helpful comments on earlier drafts.

References (34)

  • N. Bartlett et al.

    Synchronization of a motor response with an anticipated sensory event

    Psychol. Rev.

    (1959)
  • R. Bermant et al.

    The effect of the degree of visual–auditory stimulus separation upon the spatial interaction of vision and audition

    Percept. Mot. Skills

    (1976)
  • P. Bertelson et al.

    Cross-modal bias and perceptual fusion with auditory–visual spatial discordance

    Percept. Psychophys.

    (1981)
  • J. Bhattacharya et al.

    Sound-induced illusory flash perception: role of gamma band responses

    NeuroReport

    (2002)
  • K. Dunlap

    Reactions to rhythmic stimuli, with attempt to synchronize

    Psychol. Rev.

    (1910)
  • R. Fendrich et al.

    The temporal cross-capture of audition and vision

    Percept. Psychophys.

    (2001)
  • W. Fujisaki et al.

    Temporal frequency characteristics of synchrony–asynchrony discrimination of audio-visual signals

    Exp. Brain Res.

    (2005)