Abstract
The debate on the neural basis of multitasking costs revolves around the neural overlap between concurrently performed tasks. Recent evidence suggests that training-related reductions in representational overlap in fronto-parietal brain regions predict multitasking improvements. Cognitive theories assume that overlap of task representations may lead to unintended information exchange between tasks (i.e., crosstalk). Modality-based crosstalk has been suggested as a source of multitasking costs in multisensory settings. Robust findings of increased costs for certain modality mappings may be explained by crosstalk between the stimulus modality in one task and the sensory action consequences in the concurrently performed task. Whether modality-based crosstalk emerges from representational overlap in general fronto-parietal multitasking regions or in modality-specific regions is not yet known. In this functional neuroimaging study, we investigate neural overlap during multitasking performance in humans, focusing on modality compatibility by employing multivariate pattern analysis and modality-specific practice interventions in three groups (total N = 54, 24 females). We observed significant differences between modality compatible and modality incompatible single-task representations, specifically in the auditory cortex but not in fronto-parietal regions. Notably, improved auditory decoding accuracy for the modality incompatible tasks was predictive of performance gains in the corresponding dual task, along with a complete elimination of modality-specific dual-task costs. This predictive relationship was evident only in the group practicing modality incompatible mappings, suggesting that specific practice on task sets with modality overlap influenced both neural representations and subsequent multitasking performance. This study contributes to the integration of cognitive theory and neuroscience and clarifies the role of task representations in dual-task interference.
Significance Statement
In a society dominated by multitasking, understanding its neurocognitive basis and plasticity is crucial for many aspects of everyday life. We investigate the neural mechanisms behind multitasking limitations, offering insights for targeted cognitive interventions. The study builds upon established theories of cognitive multitasking and imaging research, addressing the concept of modality-based crosstalk—the unintended exchange of modality-based information between tasks. Through functional brain imaging and pattern analysis, we examined how neural task representations contribute to performance costs in dual tasks with varying degrees of modality overlap. Notably, our findings demonstrate a practice-related decrease in neural overlap which is associated with substantial multitasking improvements, specifically in the auditory cortex, emphasizing the contribution of sensory regions to flexible multidimensional task representations.
Introduction
Human limitations in multitasking are significant and can lead to safety-relevant consequences in everyday life, for example, when using a mobile phone while driving. A long-held debate relates to the question of whether performance costs in multitasking emerge based on the neural overlap of concurrently performed tasks (Klingberg, 1998; Just et al., 2001). Recent theories focusing on fundamental computational dilemmas (i.e., sharing vs. separation of neural representations) support the idea that representational overlap constrains human multitasking (Badre et al., 2021; Musslick and Cohen, 2021; Garner and Dux, 2023), consistent with evidence from a multivariate imaging study (Garner and Dux, 2015). This study revealed that multitasking training reduces the overlap of concurrent task representations in fronto-parietal brain regions which predicts training improvements in multitasking (Garner and Dux, 2015).
The overlap of task representations can result in the unintentional exchange of information between tasks, called central crosstalk (Navon and Miller, 1987; Logan and Gordon, 2001; Koch, 2009; Janczyk et al., 2014). Crosstalk may lead to between-task benefits or interference at different task levels, which is supported by behavioral and neural research (Lien and Proctor, 2002; Halvorson and Hazeltine, 2015; Koch et al., 2018; Paas Oliveros et al., 2023). Recently, modality-based crosstalk has been suggested to underlie the increased multitasking costs observed when comparing dual tasks with different modality mappings (Hazeltine et al., 2006; Schacherer and Hazeltine, 2020). For example, visual-manual and auditory-vocal (i.e., modality compatible) modality mappings produce consistently lower dual-task costs than visual-vocal and auditory-manual (i.e., modality incompatible) mappings (Fig. 1B) (Hazeltine et al., 2006; Stelzel et al., 2006; Göthe et al., 2016). Modality-based crosstalk refers to interference between the stimulus modality in one task (e.g., an auditory stimulus) and the sensory action consequences in the concurrently performed task (e.g., the auditory action effect of a vocal response), even though the stimulus and response modalities themselves do not overlap between tasks.
So far, the modality-based crosstalk assumption has been supported primarily by behavioral research (Hazeltine et al., 2006; Göthe et al., 2016; Schacherer and Hazeltine, 2020, 2021). It remains unclear how modality-based crosstalk evolves at the neural level and how this is affected by multitasking practice. Specifically, it is unknown whether representational overlap is present in general multitasking-related brain regions (i.e., fronto-parietal regions, as identified by previous research on multitasking training Garner and Dux, 2015 and by a meta-analysis Worringer et al., 2019) or rather in modality-specific sensory brain regions.
Assuming that crosstalk of modality-specific task features most likely involves modality-specific sensory regions rather than supramodal regions (Garner and Dux, 2015), we hypothesized higher neural overlap between tasks with a modality incompatible mapping in the sensory brain regions related to the response-related sensory action effects. To test this, we used multivariate pattern analysis (MVPA) to decode single-task representations from functional magnetic resonance imaging (fMRI) data.
Additionally, we expected that the degree of modality overlap contributes to multitasking performance. Based on the training literature (Garner and Dux, 2015), we predicted a practice-related decrease of neural overlap in sensory regions, specifically for modality incompatible tasks. Participants completed single and dual tasks during fMRI measurements before and after a dual-task practice intervention. The sample was randomly split into three practice-intervention groups (one per modality mapping plus a passive control group); the two practice groups completed the same dual tasks as during the fMRI measurements for 80 min (Fig. 1A).
Previewing our results, we replicate the elimination of the substantial difference in behavioral dual-task costs between modality mappings after practicing the modality incompatible mapping (Mueckstein et al., 2022). We found a significant difference in decoding accuracy between the modality incompatible and the modality compatible mappings in the auditory region of interest. This supports the assumption of differences in representational overlap between modality mappings and thus extends previous behavioral and neural findings in the field. Additionally, only for participants completing the modality incompatible practice, a selective decrease in neural overlap between the modality incompatible tasks in the auditory region was positively associated with individual performance improvements.
Methods
This study was pre-registered prior to data analyses (https://osf.io/whpz8). Accordingly, sections in the methods are mostly copied from the preregistration and shortened. We explicitly report any deviations.
Participants
The total sample of this study consisted of 71 healthy right-handed adults aged 18 to 30 years with German as their first language (or a comparable level) and normal or corrected-to-normal vision. Exclusion criteria were any neurological or psychiatric diseases, current medical conditions that could potentially influence brain functions, past or present substance abuse (alcohol and drugs), a self-reported weakness in distinguishing left and right, and common contraindications for MRI scanning. Participants were excluded from the specific analysis if their head movement exceeded the threshold of 25% of volumes with framewise displacement > 0.4 mm, if they committed more than 30% errors per run in more than three single-task runs (or one localizer run), or if the error rate during the practice intervention was higher than 50%. For the dual-task performance, due to the high error rate, we deviated from the pre-registration protocol and limited the 30% criterion to trials in which both stimuli were presented on the same side (i.e., congruent trials, averaged over both modality mappings) to ensure that participants were still on task, as incongruence of stimulus information between tasks (i.e., stimuli presented on different sides) increased task difficulty in addition to modality compatibility (error rate for congruent stimuli: M = 17.25, SD = 14.90; incongruent stimuli: M = 44.92, SD = 30.04). An overview of the specific exclusion numbers and reasons for each analysis can be found on OSF (https://osf.io/5xbd3). All three groups were very similar in age and gender distribution (∼50% female) (see Table 1). All participants gave their written informed consent before the first session of the study and could choose between 60 € and course credit as reimbursement after completing all sessions. The ethics committee of the Freie Universität approved the study, which was conducted in accordance with the Declaration of Helsinki.
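For illustration, the head-motion exclusion rule can be expressed as a minimal R sketch; the confounds layout, column names, and toy values below are our own assumptions, not the actual pipeline output.

```r
# Flag a run when more than 25% of its volumes exceed a framewise
# displacement (FD) of 0.4 mm (illustrative implementation of the rule above).
library(dplyr)

flag_high_motion <- function(fd, fd_threshold = 0.4, max_prop = 0.25) {
  mean(fd > fd_threshold, na.rm = TRUE) > max_prop
}

set.seed(1)
# Toy data: one row per volume, with hypothetical participant, run, and FD columns
confounds <- data.frame(
  participant = "sub-01",
  run         = rep(1:2, each = 100),
  fd          = c(runif(100, 0, 0.3), runif(100, 0, 0.8))
)

confounds %>%
  group_by(participant, run) %>%
  summarise(exclude = flag_high_motion(fd), .groups = "drop")
```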
Experimental overview
Participants completed three sessions; the first session was held online and included behavioral and cognitive measures that will be reported elsewhere. A detailed description of this first session can be found in the preregistration of another project (https://osf.io/nfpqv). The remaining two sessions (each 2.5–3 h) took place at the Cognitive Center for Neuroscience Berlin (CCNB). During the second session (Session 1 in Fig. 1A), participants started outside the scanner with a short familiarization with the tasks (256 trials, 32 per single task, 64 per dual task) before they completed the session in the MRI. Participants returned to the CCNB for the final session (Session 2 in Fig. 1A) after a minimum of five and a maximum of nine days, at the same time of day. They repeated the shortened familiarization (128 trials, 16 per single task, 32 per dual task) and continued with the group-specific practice intervention for 80 min before they finished the study with the Post part in the scanner.
Behavioral tasks
Participants performed sensorimotor choice reaction tasks, either as single or dual tasks, with varying modality mappings (modality compatible or modality incompatible, compare Fig. 1B). In the visual domain, the stimuli were a white square (56.8 × 56.8 pixels) on a black background at six different positions (top, center, bottom), three to the right of a white fixation cross (41.1 × 41.1 pixels, thickness 9.9 pixels) and three to the left. In the auditory domain, stimuli were pure tones at three different frequencies (200, 450, and 900 Hz), presented to either the right or the left ear. In the dual-task blocks, stimuli were presented simultaneously (SOA = 0 ms). Participants had to respond to the side of the stimuli by pressing a button with their right or left hand (index finger) and/or by saying the German word for "right" or "left". The pairing of stimulus and response modality determines the modality mapping: the combination of visual-manual and auditory-vocal is considered modality compatible, and the combination of visual-vocal and auditory-manual modality incompatible. Consequently, there was no overlap in either response or stimulus modality within each dual-task condition.
Additionally, we manipulated the task difficulty in the single-task runs by adding visual noise to the stimuli, increasing the distance between the fixation cross and the stimulus, and reducing the contrast between the stimulus and background. For the auditory stimulus, we also added white noise and reduced the volume of the tone relative to the noise. We only included the easy blocks in the behavioral analysis and used the difficulty manipulation as a control analysis for the MVPA. Stimulus material is provided online (https://osf.io/w9hsu/). We randomized per participant the order of the dual-task runs (modality compatible vs. modality incompatible mapping first) and the block position for the single tasks within each run. Within each block, each stimulus was presented equally often and in random order. To prevent a systematic confound from the visual appearance of the task instruction shown at the beginning of each block, we presented task instructions as either a small picture or a text, with different pictures and different fonts for each block to prevent any repetition in visual appearance. Deviating from the preregistration, we decided against separate analyses of reaction times and accuracy rates and instead used the balanced integration score (BIS; Liesefeld and Janczyk, 2019). The combined BIS parameter has the advantage of controlling for a potential speed-accuracy trade-off, as shown by Liesefeld and Janczyk (2019): individuals might prefer different strategies, focusing either more on accuracy or more on speed, and a combined parameter accounts for those potential differences. Additionally, analyzing only one parameter increases statistical power and reduces the complexity of the analyses compared to separate analyses of reaction time and accuracy. The BIS parameter is calculated as the difference between z-standardized accuracies and z-standardized reaction times.
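To make the scoring explicit, the following is a minimal R sketch of how a BIS-based score can be computed from trial-level data; the toy data, grouping, and condition labels are ours, and the standardization follows Liesefeld and Janczyk (2019) only in spirit.

```r
# Balanced integration score (BIS) sketch: z-standardize accuracy and mean RT,
# then take their difference so that both contribute equally (higher = better).
library(dplyr)
library(tidyr)

set.seed(1)
# Toy trial-level data (hypothetical participants, single- vs. dual-task trials)
trial_data <- expand.grid(participant = paste0("P", 1:10),
                          condition   = c("single", "dual"),
                          trial       = 1:50) %>%
  mutate(correct = rbinom(n(), 1, ifelse(condition == "dual", 0.85, 0.95)),
         rt      = rnorm(n(), ifelse(condition == "dual", 900, 700), 100))

bis_scores <- trial_data %>%
  group_by(participant, condition) %>%
  summarise(acc     = mean(correct),
            mean_rt = mean(rt[correct == 1]),      # RTs from correct trials only
            .groups = "drop") %>%
  mutate(bis = as.numeric(scale(acc)) - as.numeric(scale(mean_rt)))

# Dual-task costs as the single-task minus dual-task difference in BIS
# (higher values = larger costs), as used in the Results section.
dt_costs <- bis_scores %>%
  select(participant, condition, bis) %>%
  pivot_wider(names_from = condition, values_from = bis) %>%
  mutate(cost = single - dual)
```

Depending on the exact standardization grouping, the absolute values differ, but the logic of integrating speed and accuracy into a single score remains the same.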
All statistical analyses and plotting were done in R [version 4.2.2; R Core Team (2020)] with RStudio [version 2023.12.1; RStudio Team (2019)] and the tidyverse package [version 2.0.0; Wickham et al. (2019)]. The manuscript was created with the papaja package [version 0.1.2; Aust and Barth (2023)].
Practice intervention
The practice intervention was completed outside the scanner and consisted only of dual-task trials, using the same stimuli, responses, and presentation timing as in the Pre and Post measurements (compare the previous section, Behavioral Tasks). For the compatible intervention group, only modality compatible dual-task trials were presented, and for the incompatible intervention group, only modality incompatible dual-task trials. Participants worked on seven runs, each consisting of four blocks with 64 trials per block (a total of 1792 trials). After each run, participants were asked about their subjective focus, motivation, fatigue, and frustration. After completing the intervention, participants entered the scanner and completed another run of the intervention tasks (256 trials) inside the scanner, without acquisition of brain data. Participants in the passive control group paused for 80 min, meaning they were instructed not to engage in cognitively demanding tasks but were otherwise free to do what they liked. After the break, they started immediately with the Post part in the scanner.
MRI session
The first fMRI session consisted of a resting-state scan and twelve task runs in a block design. Participants started with a 10-min resting-state scan with eyes open, followed by two runs of a localizer task, used to define the regions of interest (ROIs), each including single and dual tasks in both modality mappings. Each run of the localizer task contained six blocks of 16 trials each. In the Pre part, the first two runs contained only dual-task trials; each run was assigned to one modality mapping and consisted of 128 trials. The remaining eight runs contained only single-task trials in both modality mappings, with an easy and a difficult version of the tasks. In each run, every combination of task, modality mapping, and difficulty occurred only once, resulting in eight blocks with 16 trials per block. All stimuli were presented for 200 ms, followed by a response interval of 1500 ms and an inter-stimulus interval of 200 ms. Each run concluded with an 8 s fixation period. The session lasted about 2.5 h. The Post fMRI session after the practice intervention was the same as the Pre session, starting with the two dual-task runs, followed by the eight single-task runs. This session lasted about 3 h.
MRI data acquisition
Due to a scanner upgrade at the imaging center, the data were acquired with two different scanners. For both scanners, the same head coil and parameters were used, and each participant completed both sessions in the same scanner. The first 25 participants (10× passive intervention, 6× modality compatible intervention, 6× modality incompatible intervention, 3× only Pre-measurement) were measured with a Siemens Magnetom TIM TRIO syngo 3T and the remaining participants with a Siemens Magnetom Prisma 3T, both with a 32-channel head coil. At the end of the first session, a high-resolution T1-weighted structural image was acquired with 176 interleaved slices, 1 mm isotropic voxels, TE = 2.52 ms, TR = 1900 ms, FoV = 256 × 256 × 176 mm. Functional runs consisted of 139 whole-brain echo-planar images of 37 interleaved slices for the localizer task and the dual-task runs, and 183 whole-brain echo-planar images for each single-task run. Each functional run was acquired with 3 mm isotropic voxels, TE = 30 ms, TR = 2000 ms, flip angle = 75°, FoV = 192 × 192 × 133 mm. After each dual-task run, a gradient echo field map was acquired (3 mm isotropic voxels, TE1 = 4.92 ms and TE2 = 7.38 ms, TR = 400 ms, FoV = 192 × 192 × 133 mm, flip angle = 60°). Participants received auditory stimuli via MRI-compatible headphones (SensiMetrics S14, SensiMetrics, USA). Visual stimuli were projected on a screen at the end of the bore, which participants could view through a mirror attached to the head coil. Vocal responses were recorded via an MRI-compatible microphone (Optimic MEG, Optoacoustics, Israel) and manual responses via MRI-compatible 4-button bimanual boxes (HHSC-2 × 2, Current Designs, USA).
MRI univariate data analysis and ROI definition
Data were converted into BIDS format using dcm2bids (version 2.1.6; Boré et al., 2023) and preprocessed using fMRIPrep [version 21.0.2; Esteban et al. (2019)], including 3D motion correction and slice-time correction. The BIDS-converted raw data were uploaded to OpenNeuro, including a tsv file indicating the scanner type for each participant (https://doi.org/10.18112/openneuro.ds005038.v1.0.1). All functional data were aligned to a generated reference image, co-registered, and transformed to standard space. Anatomical T1-weighted data were resampled into standard MNI space. For more details, please see the output script generated by fMRIPrep, provided in the preregistration. BOLD runs of the localizer task were smoothed in SPM12 with an 8 mm FWHM Gaussian kernel. We used SPM12 to conduct the first-level analysis on all normalized BOLD runs using a block design and a general linear model, separately for the localizer runs and the single-task runs, the latter also separately for each timepoint (Pre and Post). In the localizer model, we included six motion parameters (3× rotation, 3× translation) and framewise displacement, a combined measure of head movement, as regressors of no interest. For each participant, statistical parametric maps with contrasts between the stimulus modalities (visual vs. auditory), response modalities (vocal vs. manual), and single vs. dual task were generated. For the group analysis, the individual maps were averaged and tested voxel-wise with a one-sample t-test for the defined contrasts. An FWE-corrected significance threshold (p = .05) at the voxel level was used. Note that we restricted the cluster selection to the frontal lobe for the single vs. dual-task contrast, as previous dual-task studies (Schubert and Szameitat, 2003; Stelzel et al., 2006; Worringer et al., 2019) consistently showed frontoparietal activity when contrasting dual and single tasks. As the focus of the ROI analysis was on the sensory regions, we selected only one frontoparietal cluster, namely the one with the highest peak in the lateral frontal cortex. Additionally, we added the fronto-parietal-subcortical cubic ROIs defined in a previous study on multitasking training (Garner and Dux, 2015) to compare our task-specific ROIs with those task-independent ones. Beta images used to define the group-based activity clusters were uploaded to NeuroVault (https://identifiers.org/neurovault.collection:16842).
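As an illustration of the block-design GLM logic, a single block regressor can be sketched in R by convolving a boxcar over block onsets with a canonical double-gamma hemodynamic response function; the onsets, block length, and HRF parameters below are illustrative defaults, not the exact SPM12 settings used in the study.

```r
# Sketch of one block-design regressor: boxcar over (hypothetical) block onsets
# convolved with a double-gamma HRF sampled at the TR.
tr        <- 2                 # seconds, as in the acquisition
n_scans   <- 139               # scans per localizer/dual-task run
onsets    <- c(10, 60, 110)    # hypothetical block onsets (in scans)
block_len <- 16                # hypothetical block length (in scans)

boxcar <- numeric(n_scans)
for (o in onsets) boxcar[o:(o + block_len - 1)] <- 1

t_hrf <- seq(0, 30, by = tr)   # HRF sampled over 30 s
hrf   <- dgamma(t_hrf, shape = 6, rate = 1) -
         dgamma(t_hrf, shape = 16, rate = 1) / 6   # SPM-like double gamma

regressor <- convolve(boxcar, rev(hrf), type = "open")[seq_len(n_scans)]
plot(regressor, type = "l", xlab = "scan", ylab = "predicted BOLD (a.u.)")
```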
The resulting highest activation clusters for each contrast (visual, auditory, manual, vocal, and frontal) were used to define the ROIs. The clusters served as boundaries to determine the group-based activation peak. This voxel was defined as the center of a 10 mm sphere (compare Table 2 for peak coordinates). A group-based sphere was defined for each contrast. Post hoc, we included an individual-differences approach and also identified individual peaks within the group clusters. We are aware that the sample size is not ideal for an individual-differences approach, but we included several control analyses (i.e., comparing results at the individual level with the group sphere and the group cluster) to ensure the robustness of the findings. In preparation for the MVPA, the single-task model included only regressors for each single-task combination (visual-manual, visual-vocal, auditory-manual, auditory-vocal, each in both difficulty levels), without motion regressors.
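For clarity, the spherical ROI construction can be sketched as follows in R; the voxel grid and peak coordinate are placeholders, not coordinates from Table 2.

```r
# Sketch of a 10 mm spherical ROI around a peak coordinate (in mm):
# keep all voxels whose centers lie within the radius.
sphere_roi <- function(voxel_xyz, peak_xyz, radius = 10) {
  d <- sqrt(rowSums(sweep(voxel_xyz, 2, peak_xyz)^2))
  voxel_xyz[d <= radius, , drop = FALSE]
}

# Toy 3 mm voxel grid and a placeholder peak coordinate
grid <- as.matrix(expand.grid(x = seq(-60, 60, 3),
                              y = seq(-90, 60, 3),
                              z = seq(-40, 70, 3)))
peak <- c(-50, -20, 10)        # placeholder, not a study coordinate
roi  <- sphere_roi(grid, peak)
nrow(roi)                      # number of voxels in the sphere
```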
Multivoxel pattern analysis
After performing the first-level analysis on the single-task runs, we submitted the resulting subject-specific beta images (per run and single-task combination, based on the block design) to The Decoding Toolbox (Hebart et al., 2014) to create individual decoding maps for each modality mapping. We used the default methods of The Decoding Toolbox: support vector machines (SVMs) as the decoding method, leave-one-run-out cross-validation, and an ROI analysis. This resulted in one decoding accuracy value for each ROI (auditory, visual, vocal, manual, frontal) and each modality mapping (modality compatible and modality incompatible), with a chance level of 50%. We compared our results with the spheric ROIs defined by the group activation and with the whole activation cluster as ROI to ensure that our results do not depend on the small spheres (compare Fig. 3). We also ran a whole-brain searchlight analysis (radius 11 mm) to rule out that other brain regions, not identified in the univariate analysis, contained information about the modality mapping. This was not the case: only auditory regions differed significantly between the two modality mappings (Fig. 3). To further ensure that our results were not influenced by differences in task difficulty and type of instruction (instruction as text or image), we employed a cross-classification across the different difficulty levels and types of instruction, respectively. Specifically, we trained the classifiers to decode visual-manual-easy vs. auditory-vocal-difficult single tasks and tested on visual-manual-difficult vs. auditory-vocal-easy (and analogously for the modality incompatible mapping). This procedure eliminates the influence of task difficulty on the decoding accuracy between the two tasks. In both analyses, the difference between the modality mappings in the auditory regions remained significant (paired t-tests, corrected with the Benjamini–Hochberg procedure; Benjamini and Hochberg, 1995). These results rule out the explanation that the classifier merely differentiated between difficulty levels or instruction types.
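For readers less familiar with the procedure, the leave-one-run-out decoding scheme can be sketched conceptually in R with a linear SVM; the actual analysis used The Decoding Toolbox in MATLAB, and the data dimensions, labels, and toy patterns below are ours.

```r
# Conceptual sketch of leave-one-run-out decoding with a linear SVM
# (not the actual Decoding Toolbox code used in the study).
library(e1071)

set.seed(1)
n_runs   <- 8
n_voxels <- 120                                   # e.g., voxels of one ROI
betas    <- matrix(rnorm(2 * n_runs * n_voxels),  # toy beta patterns
                   nrow = 2 * n_runs)
labels   <- factor(rep(c("task_A", "task_B"), times = n_runs))
runs     <- rep(seq_len(n_runs), each = 2)

fold_acc <- sapply(seq_len(n_runs), function(test_run) {
  train <- runs != test_run
  fit   <- svm(x = betas[train, ], y = labels[train], kernel = "linear", cost = 1)
  pred  <- predict(fit, betas[!train, ])
  mean(pred == labels[!train])
})
mean(fold_acc)    # ROI decoding accuracy; chance level = 0.5
```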
Results
Influence of modality overlap on behavior and practice-related changes
We assessed behavioral performance as a balanced integration score (BIS) of reaction times and accuracies (Liesefeld and Janczyk, 2019) (see Table 3 for reaction times and error rates; statistics and graphs are available at https://osf.io/tahds), to account for the dependence of the two parameters and to obtain a single parameter for the subsequent correlational analyses with the neural decoding parameter. Dual-task costs were calculated as the difference between single and dual tasks, with higher values indicating higher dual-task costs (i.e., longer reaction times and higher error rates in dual tasks than in single tasks).
At baseline, all three groups showed a robust behavioral effect of modality mapping, F(1, 54) = 178.92, p < .001,
Neural overlap of task representations
We investigated the overlap of task representations at baseline in task-relevant regions with MVPA on subject-specific beta images from the first-level single-task analysis. We trained linear SVMs using leave-one-run-out cross-validation (implemented in The Decoding Toolbox; Hebart et al., 2014) to distinguish between the two single tasks from fMRI activity patterns, separately for the modality compatible and the modality incompatible mapping (compare Fig. 1B). Task-relevant regions were defined by task-related univariate clusters in two separate localizer runs in which participants performed the same single and dual tasks as in the main experiment in a block design (Fig. 1A). We contrasted the input modalities (visual vs. auditory), the output modalities (manual vs. vocal), and single vs. dual tasks (frontal region as multitasking-specific region) to create five task-relevant clusters per hemisphere (gray clusters in Fig. 2A) as a basis for the ROI analysis. Within each cluster, we defined a sphere (10 mm radius) centered at the maximum voxel on the group level (orange spheres in Fig. 2A; peak coordinates in Table 2).
Our results demonstrate that the trained classifiers can robustly distinguish between the two single tasks (i.e., visual-manual vs. auditory-vocal and visual-vocal vs. auditory-manual, respectively) in both modality mappings and in all task-relevant regions: decoding accuracies for all ROIs were significantly above chance level (50%), all corrected t-tests p < .001. Note that applying a t-test to decoding accuracies warrants a different interpretation than the comparison of standard brain activity: a significant t-test on decoding accuracies shows, similar to a fixed-effects analysis, that there is an effect in at least one person and does not allow the inference that the effect is present in the population (Allefeld et al., 2016).
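The structure of these group-level tests (above-chance tests and the paired comparison between mappings reported below) can be sketched in R as follows; the accuracies are simulated and the grouping variables are ours, so the numbers do not correspond to the reported statistics.

```r
# Sketch of the group-level tests on ROI-wise decoding accuracies:
# one-sample t-tests against chance (0.5) and a paired comparison between
# modality mappings, with Benjamini-Hochberg correction. Data are simulated.
set.seed(2)
rois <- c("auditory", "visual", "vocal", "manual", "frontal")
acc  <- expand.grid(participant = 1:20, roi = rois,
                    mapping = c("compatible", "incompatible"))
acc$decoding <- 0.55 + rnorm(nrow(acc), 0, 0.05)

# Above-chance tests per ROI and mapping (BH-corrected)
p_chance <- by(acc, list(acc$roi, acc$mapping),
               function(d) t.test(d$decoding, mu = 0.5)$p.value)
p.adjust(unlist(p_chance), method = "BH")

# Paired comparison compatible vs. incompatible within each ROI (BH-corrected)
p_map <- sapply(rois, function(r) {
  d <- acc[acc$roi == r, ]
  t.test(d$decoding[d$mapping == "compatible"],
         d$decoding[d$mapping == "incompatible"], paired = TRUE)$p.value
})
p.adjust(p_map, method = "BH")
```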
Remarkably, a pairwise t-test, corrected for multiple comparisons, revealed a significant difference between the representation of the modality mappings only in the auditory region, t(68) = 3.64, p = .003. Here, decoding accuracies were higher for modality compatible tasks (
To rule out that the results are due solely to our ROI selection, we ran the same analysis on the whole group cluster (Fig. 3A), performed a searchlight analysis across the whole brain (Fig. 4), and tested for potential influences of task difficulty (Fig. 3B) and task instruction (Fig. 3C). All analyses confirmed the difference in the auditory region, except at the cluster level, where the difference was only numerical (p = .34). Accordingly, the following results sections will focus on this auditory region, which selectively differentiates between the modality compatible and modality incompatible mappings. We further investigated whether the higher neural overlap at baseline is also associated with behavioral performance. Surprisingly, we found no significant correlation between decoding accuracy and dual-task costs at baseline (all r between −0.15 and 0.03, all p > .27). This could be due to two factors: first, the reliability of the behavioral performance estimates in the Pre session might be low; second, the strength of decoding alone might have no primary relevance for behavior.
In sum, these findings demonstrate that the neural task representations of the two single tasks in the modality incompatible mapping overlap more in the auditory regions than in the modality compatible mapping, supporting the assumed difference in sensory task representation for modality compatible and modality incompatible tasks. For the first time, we provide evidence that the theoretical overlap of stimulus and action-effect modalities is also represented in neural single-task representations in sensory regions instead of general multitasking-related fronto-parietal regions.
Practice-related changes of neural task representations and their relation to multitasking performance
To further substantiate the role of sensory neural overlap for multitasking performance in the two modality mappings, we examined whether the difference in overlap in the auditory region changes with practice and whether this change is related to behavioral change. As the functional organization of the brain is highly variable between individuals, we here used the individual maximum voxel within each group cluster of the localizer task to define individual spheric ROIs for these pre–post comparisons (Fig. 2B). Performance gains and changes in decoding accuracy were defined as the difference between the Pre and Post timepoints; higher values indicate a performance gain and an increase in decoding accuracy after the practice intervention, respectively.
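The brain-behavior association examined next boils down to correlating two difference scores; a minimal R sketch with simulated values (variable names are ours) is shown below.

```r
# Sketch of the pre-post change scores and their association: change in
# decoding accuracy (Post minus Pre) correlated with the behavioral
# performance gain in the corresponding dual task. Values are simulated.
set.seed(3)
n <- 18                                        # e.g., one practice group
decoding_pre     <- rnorm(n, 0.55, 0.05)
decoding_post    <- decoding_pre + rnorm(n, 0.03, 0.04)
performance_gain <- 0.5 * (decoding_post - decoding_pre) + rnorm(n, 0, 0.03)

delta_decoding <- decoding_post - decoding_pre # higher = better task separation
cor.test(delta_decoding, performance_gain)     # Pearson correlation
```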
We did not find a significant effect of timepoint in the task representation after the practice intervention for any practice group (main effect of timepoint, F(1, 60) = 0.01, p = .928,
This result confirmed that the significant correlation is not due to differences in single-task performance or head movement. Accordingly, the degree of separation of task representations in auditory regions after practicing a modality incompatible mapping for one session can be considered an important predictor for the elimination of modality-specific dual-task interference within a given session.
Discussion
While multimodality is a typical characteristic of most everyday multitasking situations, little is known about modality-specific multitasking costs, going beyond attentional or motor limitations. Here, we investigated how modality-based crosstalk between action-effect modality and stimulus modality (Schacherer and Hazeltine, 2020) evolves on a neural level. Specifically, we examined whether modality-specific neural overlap is coded in general multitasking-related brain regions or modality-specific sensory regions. We further elucidated how it affects multitasking performance, and how practice changes those representations. In line with the modality-based crosstalk assumption (Hazeltine et al., 2006; Schacherer and Hazeltine, 2020), we found a significant difference between modality compatible and modality incompatible single-task representations in modality-specific sensory brain regions (i.e., auditory cortex) and not in multitasking-related regions in frontal and parietal cortex. In addition, practice-related improvements in modality incompatible decoding accuracy were associated with performance gains in the modality incompatible dual task. Individuals who succeeded most in reducing modality-specific dual-task costs were those with the greatest sensory separation, supporting the assumed relevance of sensory representations for multitasking performance. This effect was only present for the group who practiced the modality incompatible mapping during the intervention, suggesting the build-up of highly specific task representations to deal with potential crosstalk.
For the first time, we provide evidence from neural data for the relation between dual-task crosstalk and sensory modalities, specifically in the auditory cortex. This complements previous findings which revealed multitasking-training-related changes of representational overlap in fronto-parietal regions (Garner and Dux, 2015). While it has been discussed that sharing representations may be advantageous for rapid learning and generalization, sharing also facilitates interference and crosstalk, which is reduced by segregation but at the cost of reduced generalizability to other task contexts (Musslick and Cohen, 2021; Garner and Dux, 2023). In their study, Garner and Dux (2015) addressed the representational basis of multitasking costs per se by exclusively investigating dual-task performance in relation to single-task performance. In our study, in contrast, we directly compared dual tasks with different degrees of modality overlap, addressing specifically the basis of modality-based crosstalk in the context of modality compatibility (Hazeltine et al., 2006; Stelzel et al., 2006). While reducing representational overlap in the fronto-parietal cortex may improve the general ability to process two tasks simultaneously (Garner and Dux, 2015, 2023), reducing representational overlap in modality-specific regions seems to reduce modality-specific sources of multitasking costs such as modality-based crosstalk.
An alternative account for the emergence of the modality-compatibility effect is simply slower routing of information for modality incompatible mappings (e.g., Greenwald, 1970; Wang and Proctor, 1996): non-preferred processing routes (e.g., auditory-manual, visual-vocal) may simply be slower and thus lead to greater dual-task interference. However, studies consistently report no difference between single tasks (e.g., Stelzel et al., 2006; Göthe et al., 2016) and persistent dual-task effects when single-task performance is explicitly matched (e.g., Experiment 3 in Hazeltine et al., 2006). In our data, the modality compatible single tasks were even slightly slower than the modality incompatible single tasks, while the robust effect of modality compatibility in dual-task trials was still present. Thus, on a behavioral level, the difference in dual-task costs cannot be attributed solely to differences in single tasks, favoring the modality-based crosstalk account, including the role of action effects. This is also in line with Schacherer and Hazeltine (2023), who specifically tested different explanations for modality-specific dual-task costs by manipulating the action effects of an auditory-manual task that was paired with a visual-manual task. They showed that adding auditory action effects, which do not interfere or overlap with the second response to the visual task, led to decreased dual-task costs without a change in single-task performance (Experiment 2), concluding that their results are best aligned with the crosstalk assumption. The decoding results of our study, in which we did not artificially manipulate action effects but referred to the expected and learned sensory consequences of actions, can also not be explained by (non-)preferred routing: this account does not explain why the degree of neural overlap in auditory regions is related to modality-specific dual-task costs. Further studies might shed light on the contribution of non-preferred routing between stimulus and response modalities by applying connectivity approaches to test whether stronger connections between regions involved in visual-manual and auditory-vocal tasks, compared to visual-vocal and auditory-manual tasks, are related to dual-task costs.
The role of sensory and motor regions in representing stimuli and/or responses and of the fronto-parietal cortex in representing task rules has been studied extensively (see review by Woolgar et al., 2016). Our data support the importance of the fronto-parietal cortex for distinguishing between single tasks (significant above-chance decoding), but this decoding seems to be independent of the modality mappings. Previous dual-task research on the role of the fronto-parietal cortex has mostly focused on processes involved in dual-tasking, applying mainly univariate analyses (see review by Worringer et al., 2019). Integrating these findings into representational approaches will be a challenge for future studies in the field. Likewise, asymmetries in modality-specific representations need to be considered in more detail. In the present study, modality-specific effects were exclusively present in the auditory cortex, without comparable effects in visual regions. One potential explanation for the significance of auditory brain regions associated with anticipated action effects of vocal responses is provided by the forward model of self-initiated movements. According to this model, sensory cortices receive a copy of a motor command (i.e., efference copy) while a movement and its effects (i.e., action effects) are being planned (Holst and Mittelstaedt, 1950; Wolpert, 1997; Ody et al., 2023). Several studies demonstrated strong modulation of the auditory cortex as a consequence of speech; more specifically, the anticipated auditory signal of one's own speech already changes activity in the auditory cortex when it is actually heard (Ford et al., 2005; Heinks-Maldonado et al., 2005; Niziolek et al., 2013). In contrast, Straube and colleagues used self-generated button presses and manipulated their multisensory consequences (visual and auditory). They provided evidence that a button press leads to a general preparation to process any following stimulus, irrespective of its modality. In other words, the expected sensory outcome of a manual button press seems to be broader than the auditory action effect of a vocal response. This is also reflected in everyday experience, where pressing a button can result in a visual effect (i.e., turning the light on and off) or an acoustic effect (i.e., ringing the doorbell), whereas speech always results in an auditory effect. It might be that the action effects of a button press are more distributed over the cortex and thus more difficult to decode in sensory regions, whereas the specific action effect of speech can reliably be decoded in the auditory cortex.
Importantly, action effects are a matter of learning. According to the ideomotor theory, the process of action selection is based on the sensory effects of this action, suggesting that there is a bidirectional connection between the action and the action effects (Greenwald, 1970; James, 1890; Prinz, 1997). Several studies provided evidence that the association between action and the action effect is not necessarily hard-wired but can be learned and thus affect task performance (Kühn et al., 2010; Schacherer and Hazeltine, 2021, 2023). For example, Kühn and colleagues established images of faces as an artificial action effect of one button press and images of houses for another button press. After this practice phase, the button press was sufficient to activate the neural representations of the previously paired types of images without actually presenting them (Kühn et al., 2010). They concluded that not only action effects guide action selection, but also an action itself activates a corresponding perceptual representation.
Our study provides additional evidence that the connection between action, action effect, and stimulus is relevant for task performance and can be changed, even during a comparatively short practice intervention. Consequently, we assume that participants in the modality incompatible intervention group learned to transiently overwrite highly learned modality-specific associations, presumably by suppressing the interfering action-effect representation and/or by building up a new one. This led to improved performance after the practice intervention, which was associated with better decodability of single-task representations, and to decreased performance for the non-practiced modality compatible mapping. Future studies may address the dynamics of suppressing and/or building a new association in more detail and explore how stable those associations are across time.
Taken together, we provide evidence that not only fronto-parietal regions but also sensory regions hold information about task representations, including action effects, which may be subject to crosstalk in a multisensory multitasking context. These findings reveal for the first time in humans that the neural representation of tasks in a multimodal setting is malleable through multitasking practice at the individual level.
Footnotes
We thank Elisa Arnold, Friederike Glueck, Gregory Gutmann, Lea Lowak, Max Nowaczyk and Oliver Stegmann for assisting in data collection and preprocessing of the vocal data. Neuroimaging was performed at the Cognitive Center for Neuroscience Berlin and was technically supported by Christian Kainz and Till Nierhaus. This work was financially supported by the German Research Foundation, Priority Program SPP 1772 [grant numbers: STE 2226/4-2; GR 3997/4-2; HE 7464/1-2; RA 1047/4-2].
The authors declare no competing financial interests.
Correspondence should be addressed to Marie Mueckstein at mariemueckstein@gmail.com.