Research Articles, Behavioral/Cognitive

Increased Connectivity among Sensory and Motor Regions during Visual and Audiovisual Speech Perception

Jonathan E. Peelle, Brent Spehar, Michael S. Jones, Sarah McConkey, Joel Myerson, Sandra Hale, Mitchell S. Sommers and Nancy Tye-Murray
Journal of Neuroscience 19 January 2022, 42 (3) 435-442; DOI: https://doi.org/10.1523/JNEUROSCI.0114-21.2021
Author affiliations: 1Department of Otolaryngology, Washington University in St. Louis, St. Louis, Missouri 63110 (Jonathan E. Peelle, Brent Spehar, Michael S. Jones, Sarah McConkey, Nancy Tye-Murray); 2Department of Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130 (Joel Myerson, Sandra Hale, Mitchell S. Sommers)

Abstract

In everyday conversation, we usually process the talker's face as well as the sound of the talker's voice. Access to visual speech information is particularly useful when the auditory signal is degraded. Here, we used fMRI to monitor brain activity while adult humans (n = 60) were presented with visual-only, auditory-only, and audiovisual words. The audiovisual words were presented in quiet and in several signal-to-noise ratios. As expected, audiovisual speech perception recruited both auditory and visual cortex, with some evidence for increased recruitment of premotor cortex in some conditions (including in substantial background noise). We then investigated neural connectivity using psychophysiological interaction analysis with seed regions in both primary auditory cortex and primary visual cortex. Connectivity between auditory and visual cortices was stronger in audiovisual conditions than in unimodal conditions, including a wide network of regions in posterior temporal cortex and prefrontal cortex. In addition to whole-brain analyses, we also conducted a region-of-interest analysis on the left posterior superior temporal sulcus (pSTS), implicated in many previous studies of audiovisual speech perception. We found evidence for both activity and effective connectivity in pSTS for visual-only and audiovisual speech, although these were not significant in whole-brain analyses. Together, our results suggest a prominent role for cross-region synchronization in understanding both visual-only and audiovisual speech that complements activity in integrative brain regions like pSTS.

SIGNIFICANCE STATEMENT In everyday conversation, we usually process the talker's face as well as the sound of the talker's voice. Access to visual speech information is particularly useful when the auditory signal is hard to understand (e.g., background noise). Prior work has suggested that specialized regions of the brain may play a critical role in integrating information from visual and auditory speech. Here, we show that a complementary mechanism, relying on synchronized brain activity among sensory and motor regions, may also play a critical role. These findings encourage reconceptualizing audiovisual integration in the context of coordinated network activity.

  • audiovisual integration
  • language
  • lipreading
  • speech
  • speechreading

Introduction

Understanding speech in the presence of background noise is notoriously challenging, and when visual speech information is available, listeners make use of it; performance on audiovisual (AV) speech in noise is better than for auditory-only speech in noise (Sumby and Pollack, 1954). Although there is consensus that listeners make use of visual information during speech perception, there is little agreement either on the neural mechanisms that support visual speech processing or on the way in which visual and auditory speech information are combined during audiovisual speech perception.

One long-standing perspective on audiovisual speech has been that auditory and visual information is processed through separate channels and then integrated at a separate processing stage (Grant and Seitz, 1998; Massaro and Palmer, 1998). Audiovisual integration is thus often considered an individual ability that some people are better at and some people are worse at, regardless of their unimodal processing abilities (Magnotti and Beauchamp, 2015; Mallick et al., 2015).

However, more recent data have brought this traditional view into question. For example, Tye-Murray et al. (2016) showed that unimodal auditory-only and visual-only word recognition scores accurately predicted AV performance, and factor analyses revealed two unimodal ability factors with no evidence of a separate integrative ability factor. These findings suggest that rather than a separate stage of audiovisual integration, AV speech perception may depend most strongly on the coordination of auditory and visual inputs (Sommers, 2021).

Theoretical perspectives on audiovisual integration have also informed cognitive neuroscience approaches to AV speech perception. Prior functional neuroimaging studies of audiovisual speech processing have largely focused on identifying brain regions supporting integration. One possibility is that the posterior superior temporal sulcus (pSTS) combines auditory and visual information during speech perception. The pSTS is anatomically positioned between auditory cortex and visual cortex and has the functional properties of a multisensory convergence zone (Beauchamp et al., 2004). During many audiovisual tasks, the pSTS is differentially activated by matching and mismatching auditory-visual information, consistent with a role in integration (Stevenson and James, 2009). Moreover, functional connectivity between the pSTS and primary sensory regions varies with the reliability of the information in a modality (Nath and Beauchamp, 2011), suggesting that the role of the pSTS may be related to combining or weighing information from different senses.

A complementary proposal is that regions of premotor cortex responsible for representing articulatory information are engaged in processing speech (Okada and Hickok, 2009). The contribution of motor regions to speech perception is hotly debated. Evidence consistent with a motor contribution includes a self-advantage in both visual-only and AV speech perception (Tye-Murray et al., 2013, 2015) and effects of visual speech training on speech production (Fridriksson et al., 2009; Venezia et al., 2016). However, premotor activity is not consistently observed in neuroimaging studies of speech perception, and in some instances, may also reflect nonperceptual processing (Szenkovits et al., 2012; Nuttall et al., 2016). It is also possible that premotor regions are only engaged in certain types of speech perception situations (e.g., when there is substantial background noise, or when lipreading); individual differences in hearing sensitivity or lipreading ability also may affect the involvement of premotor cortex.

In addition to looking for brain regions that support visual-only or AV speech perception, we therefore broaden our approach to study the role played by effective connectivity among auditory, visual, and motor regions. If a dedicated brain region is necessary to combine auditory and visual speech information, we would expect to see it active during audiovisual speech. If changes in effective connectivity (Friston, 1994; Stephan and Friston, 2010), that is, task-based synchronized activity, underlie visual-only or audiovisual speech processing, we would expect to see greater connectivity between speech-related regions during these conditions relative to auditory-only speech. In view of these questions, we tested auditory-only speech perception and AV speech perception at a range of signal-to-noise ratios (SNRs) and obtained out-of-scanner measures of lipreading ability from our participants (Fig. 1).

Figure 1. A, Experimental conditions with auditory-only speech, visual-only speech, and audiovisual speech. B, Histogram of lipreading abilities measured outside the scanner. C, Within-scanner behavioral performance (subjective ratings of understanding); individual participants shown in dots. Error bars indicate mean ± SE.

Materials and Methods

Materials

We created seven lists of 50 words. The stimuli were recordings of a female actor speaking single words. The talker sat in front of a neutral background and spoke words along with the carrier phrase “Say the word _______” into the camera. The actor was instructed to allow her mouth to relax to a slightly open and neutral position before each target word was spoken. The edited versions of the recordings used in the current experiment did not include a carrier phrase and were each 1.5 s long. Recordings were made using a Canon Elura 85 digital video camera and showed the talker's head and shoulders. Digital capture and editing were done using Adobe Premiere Elements. The original capture format for the video was uncompressed AVI; the final versions used in the study were compressed as high-quality WMV files. Audio was leveled using Adobe Audition to ensure that each word had the same root mean squared (RMS) amplitude. Conditions that included background noise used RMS-leveled six-talker babble that was mixed and included in the final version of the file.
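As a rough illustration of the leveling and mixing steps described above (the study itself used Adobe Audition), the following Python sketch levels a recording to a target RMS amplitude and mixes it with babble at a chosen SNR. The target RMS value and the placeholder signals are assumptions for illustration only, not the authors' production pipeline.

```python
import numpy as np

def rms(x):
    """Root mean squared amplitude of a signal."""
    return np.sqrt(np.mean(x ** 2))

def level_to_rms(x, target_rms):
    """Scale a signal so its RMS amplitude matches target_rms."""
    return x * (target_rms / rms(x))

def mix_at_snr(speech, babble, snr_db):
    """Mix speech with babble noise at a given SNR (in dB).

    Both inputs are assumed to be 1-D float arrays at the same sampling
    rate; the babble is trimmed to the speech duration.
    """
    babble = babble[: len(speech)]
    # Scale babble so that 20*log10(rms(speech) / rms(babble_scaled)) == snr_db
    babble_scaled = babble * (rms(speech) / rms(babble)) / (10 ** (snr_db / 20))
    return speech + babble_scaled

# Example: create an AV -5 dB SNR version of a leveled word recording
# (in practice the word and babble would be loaded from audio files).
word = level_to_rms(np.random.randn(24000), target_rms=0.1)   # placeholder signal
babble = np.random.randn(48000)                               # placeholder babble
mixed = mix_at_snr(word, babble, snr_db=-5)
```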

The 350 recordings used in the study were selected from a corpus of 970 recordings of high-frequency words (log HAL frequency 7.01–14.99) identified using the English Lexicon Project (Balota et al., 2007). Words presented in the lipreading (visual-only) and AV conditions at varying SNRs were chosen from the larger corpus based on visual-only behavioral performance for each word, obtained from 149 participants (22–90 years old) who were tested on the entire corpus. The selected words ranged from 10 to 93% correct in the visual-only behavioral tests. They were distributed among the six conditions that included visual information (AV in quiet, AV +5 SNR, AV 0 SNR, AV −5 SNR, AV −10 SNR, and visual only) so that the conditions would, on average, be equivalent in lipreading difficulty. The words used in the auditory-only condition were selected from the remaining words.
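One way such difficulty balancing could be implemented is sketched below. This is a minimal serpentine ("snake") assignment of words, sorted by visual-only accuracy, across the six conditions containing visual information; it is an illustrative assumption, not the authors' exact procedure.

```python
import numpy as np

def assign_balanced(words, vo_accuracy, n_conditions=6):
    """Deal words across conditions so that mean visual-only (lipreading)
    accuracy is roughly equal in every condition.

    words       : list of word strings
    vo_accuracy : array of visual-only percent-correct scores, one per word
    """
    order = np.argsort(vo_accuracy)          # sort by accuracy (hardest words first)
    groups = [[] for _ in range(n_conditions)]
    for i, idx in enumerate(order):
        block, pos = divmod(i, n_conditions)
        # Serpentine order keeps the running means of the groups close together
        cond = pos if block % 2 == 0 else n_conditions - 1 - pos
        groups[cond].append(words[idx])
    return groups
```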

Participants

We collected data from 60 participants ranging in age from 18 to 34 years (mean = 22.42, SD = 3.24, 45 female). All were right-handed native speakers of American English (no languages other than English before age 7) who self-reported normal hearing and an absence of neurologic disease. All provided informed consent under a protocol approved by the Washington University in Saint Louis Institutional Review Board.

Experimental design and statistical analysis

Procedure

Before being tested in the fMRI scanner, all participants consented, completed a safety screening, and completed an out-of-scanner lipreading assessment. The behavioral lipreading assessment consisted of 50 single-word clips selected in the same way and taken from the same corpus of recorded material used in the scanner. The lipreading assessment was completed by presenting each video clip to the participant using a laptop. Participants were encouraged to verbally provide their best guess for each clip. Only verbatim responses to the stimuli were considered correct.

Participants were positioned in the scanner with insert earphones, and a viewing mirror was placed above the eyes so that they could see a two-sided projection screen located at the head end of the scanner. Those who wore glasses were provided scanner-friendly lenses that fit their prescription. Participants were also given a response box that they held in a comfortable position during testing. Each imaging run presented trials containing auditory-only, visual-only, or audiovisual speech stimuli, or printed text, via an image projected on the screen and visible to the participant through the viewing mirror. A camera positioned at the entrance to the scanner bore was used to monitor participant movement. A well-being check and short conversation occurred before each run and, if needed, participants were reminded to stay alert and asked to try to reduce their movement.

Six runs were completed during the session. Each run lasted ∼5.5 min. The first five runs were perception runs and contained 98 trials each. Stimuli were presented in blocks of five experimental trials plus two null trials for each condition, yielding 14 blocks per run (two per condition) and thus 70 experimental trials plus 28 null trials. All trials included 800 ms of quiet without a visual presentation before the stimuli began. During the null trials, participants were presented with a fixation cross instead of the audiovisual presentation. The auditory-only condition did not include visual stimuli; instead, a black screen was presented. The blocks were quasi-randomized so that two blocks from the same condition were never presented consecutively, and no two null trials occurred back to back.
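A minimal sketch of this constrained block ordering is shown below. The article does not specify how the quasi-randomization was implemented; rejection sampling is simply one assumption that satisfies the stated constraint.

```python
import random

def quasi_randomize_blocks(conditions, n_blocks_per_condition=2, seed=1):
    """Order blocks so that no two consecutive blocks share a condition.

    Simple rejection sampling: reshuffle until the constraint is satisfied.
    """
    rng = random.Random(seed)
    blocks = [c for c in conditions for _ in range(n_blocks_per_condition)]
    while True:
        rng.shuffle(blocks)
        if all(a != b for a, b in zip(blocks, blocks[1:])):
            return blocks

# Seven conditions, two blocks each, as in a single perception run
conditions = ["A", "V", "AV_quiet", "AV+5", "AV0", "AV-5", "AV-10"]
block_order = quasi_randomize_blocks(conditions)
```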

To keep attention high, half the experimental trials required a response from the participant. On response trials, a set of two dots appeared on the screen after the audiovisual/audio presentation. The right-side dot was green and the left-side dot was red. The participants were instructed to use the right-hand button on the response box to indicate yes if they were confident that they had been able to identify the previous word and to use the left-hand button if they felt they had not identified the previous word correctly.

After the initial five runs, a final run of 60 trials was presented in which participants saw a series of written words projected on the screen. The items were the same 50 words used for the behavioral visual-only assessment but did not appear in any of the other fMRI conditions. Each word stayed on the screen for 2.3 s, followed by two green dots that appeared for 2.3 s. Participants were asked to say aloud the word that was presented during the period when the dots were on the screen. Ten null trials were randomly distributed throughout the sequence. Null trials lasted 1.5 s and included a fixation cross on the screen. The reading task was always the final run.

Behavioral data analysis

The out-of-scanner lipreading assessment was scored as the percentage of correct responses made by each participant, which we used as a covariate in the fMRI analyses to explore patterns of brain activity related to lipreading ability. The in-scanner lipreading was scored similarly, except that scores were based on participants' own judgments of their accuracy. Because we had no way to verify lipreading accuracy in the scanner, we used these scores to assess qualitative differences in difficulty across conditions rather than for formal statistical analyses.
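A minimal sketch of the scoring and the rank-order comparison with in-scanner ratings (see Results) might look like the following. The verbatim string comparison and the placeholder values are illustrative assumptions, not the study data.

```python
import numpy as np
from scipy.stats import spearmanr

def percent_correct(responses, targets):
    """Score a lipreading assessment as the percentage of verbatim-correct responses."""
    correct = [r.strip().lower() == t.strip().lower() for r, t in zip(responses, targets)]
    return 100 * np.mean(correct)

# Relating out-of-scanner lipreading scores to in-scanner confidence ratings
# (values here are placeholders for per-participant summaries).
out_of_scanner = np.array([48.0, 62.0, 30.0, 74.0, 44.0])
in_scanner = np.array([0.55, 0.70, 0.40, 0.80, 0.50])
rho, p = spearmanr(out_of_scanner, in_scanner)
```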

MRI data acquisition and analysis

MRI images were acquired on a Siemens Prisma 3T scanner using a 32-channel head coil. Structural images were acquired using a T1-weighted MPRAGE sequence with a voxel size of 0.8 × 0.8 × 0.8 mm. Functional images were acquired using a multiband sequence (Feinberg et al., 2010) in axial orientation with an acceleration factor of 8 (echo time = 37 ms), providing full-brain coverage with a voxel size of 2 × 2 × 2 mm. Each volume took 0.770 s to acquire. We used a sparse imaging paradigm (Edmister et al., 1999; Hall et al., 1999) with a repetition time of 2.47 s, leaving 1.7 s of silence on each trial. We presented words during this silent period, and during the repetition task, we instructed participants to speak during the silent period to minimize the influence of head motion on the data.

Analysis of the MRI data was performed using Automatic Analysis version 5.4.0 (Cusack et al., 2014; RRID:SCR_003560), which scripted a combination of SPM12 version 7487 (Wellcome Trust Center for Neuroimaging; RRID:SCR_007037) and the Functional MRI of the Brain Software Library (FSL; Jenkinson et al., 2012) version 6.0.1 (RRID:SCR_002823). Functional images were realigned, coregistered with the structural image, and spatially normalized to Montreal Neurological Institute (MNI) space (including resampling to 2 mm voxels) using unified segmentation (Ashburner and Friston, 2005) before smoothing with an 8 mm FWHM Gaussian kernel. No slice-timing correction was used. First-level models contained regressors for the conditions of interest (event onset times convolved with a canonical hemodynamic response function). To reduce the effects of motion on statistical results, we calculated framewise displacement (FD) from the six realignment parameters, modeling the head as a sphere with a radius of 50 mm (Power et al., 2012). We censored frames exceeding an FD of 0.5, which resulted in ∼8% data loss across all participants (Jones et al., 2021). Frames with FD values exceeding this threshold were modeled out by adding one additional column to the design matrix for each high-motion scan (compare Lemieux et al., 2007).
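For readers who want the motion-censoring step in concrete form, the sketch below computes framewise displacement from six realignment parameters (Power et al., 2012) and builds one spike regressor per censored volume (compare Lemieux et al., 2007). The realignment file name is hypothetical, and this is an illustration of the general approach rather than the study's exact pipeline code.

```python
import numpy as np

def framewise_displacement(rp, radius=50.0):
    """Framewise displacement (Power et al., 2012) from realignment parameters.

    rp : (n_volumes, 6) array of three translations (mm) and three rotations
         (radians); rotations are converted to arc length on a sphere of the
         given radius.
    """
    motion = rp.copy()
    motion[:, 3:] *= radius                       # radians -> mm on a 50 mm sphere
    diffs = np.abs(np.diff(motion, axis=0))
    return np.concatenate([[0.0], diffs.sum(axis=1)])

def spike_regressors(fd, threshold=0.5):
    """One column per censored (high-motion) volume, appended to the design matrix."""
    bad = np.where(fd > threshold)[0]
    regs = np.zeros((len(fd), len(bad)))
    regs[bad, np.arange(len(bad))] = 1.0
    return regs

rp = np.loadtxt("rp_sub-01_task-speech.txt")      # hypothetical SPM realignment file
fd = framewise_displacement(rp)
nuisance = spike_regressors(fd, threshold=0.5)
```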

Psychophysiological interaction (PPI) analyses are designed to estimate the effective connectivity between brain regions (Friston et al., 1997); that is, the degree to which task demands alter the functional connectivity (i.e., statistical dependence of time series) between a seed region and every other voxel in the brain. PPI analyses thus require identifying a seed region from which to extract a time course and two (or more) tasks between which to compare connectivity with the seed region. For auditory and visual cortex regions of interest (ROIs; see below for definitions), we extracted the time course of the seed region using the SPM volume of interest functionality, summarizing the time course as the first eigenvariate of the ROI after adjusting for effects of interest.
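A simplified sketch of how PPI regressors can be assembled is given below. Note that SPM's implementation additionally deconvolves the seed signal to an approximate neural-level time course before forming the interaction; that step is omitted here for brevity, so this is an approximation of the approach rather than the toolbox's exact computation.

```python
import numpy as np

def build_ppi_regressors(seed_ts, psych_reg):
    """Simplified PPI regressors (Friston et al., 1997).

    seed_ts   : seed time course (e.g., first eigenvariate of the V1 or A1 ROI)
    psych_reg : task regressor coding the condition contrast of interest
                (e.g., AV -10 > auditory-only), convolved with the HRF

    Returns three columns to enter into the GLM alongside nuisance regressors:
    the interaction (PPI) term, the physiological regressor, and the
    psychological regressor. The contrast of interest is on the interaction.
    """
    seed_c = seed_ts - seed_ts.mean()
    psych_c = psych_reg - psych_reg.mean()
    ppi = seed_c * psych_c            # the psychophysiological interaction term
    return np.column_stack([ppi, seed_c, psych_c])
```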

Contrast images from single-subject analyses were analyzed at the second level using permutation testing (FSL randomise tool, 5000 permutations) with a cluster-forming threshold of p < 0.001 (uncorrected) and results corrected for multiple comparisons based on cluster extent (p < 0.05). Anatomical localization was performed using converging evidence from experience viewing statistical maps overlaid in MRIcroGL (Rorden and Brett, 2000), in the spirit of Devlin and Poldrack (2007), supplemented by atlas labels (Tzourio-Mazoyer et al., 2002).

Regions of interest

We defined ROIs for the left posterior superior temporal sulcus (pSTS), left primary auditory cortex (A1), and left primary visual cortex (V1). For the pSTS, the ROI was defined as a 10 mm radius sphere centered at MNI coordinates (x = −54, y = −42, z = 4) previously reported to be activated during audiovisual speech processing (Venezia et al., 2017). The ROIs for A1 and V1 were defined using the SPM Anatomy Toolbox (Eickhoff et al., 2005; RRID:SCR_013273) as the combination of areas TE 1.0, TE 1.1, and TE 1.2 in the left hemisphere (Morosan et al., 2001) and the left half of area hOC1, respectively. For the non-PPI ROI analysis, data were extracted by taking the mean of all voxels in each ROI.
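As an illustration of the sphere-based pSTS ROI (the A1 and V1 ROIs came from the SPM Anatomy Toolbox), the sketch below builds a 10 mm sphere around the MNI coordinate on an MNI-space image grid using nibabel. The template file name is hypothetical, and this is a generic sketch rather than the toolbox code used in the study.

```python
import numpy as np
import nibabel as nib

def sphere_roi(template_img, center_mm, radius_mm=10.0):
    """Binary spherical ROI mask around an MNI coordinate.

    template_img : nibabel image defining the target grid (an MNI-space image)
    center_mm    : (x, y, z) in MNI millimeters, e.g., (-54, -42, 4) for pSTS
    """
    shape = template_img.shape[:3]
    affine = template_img.affine
    # Convert every voxel index to mm coordinates via the affine
    ijk = np.indices(shape).reshape(3, -1)
    ijk_h = np.vstack([ijk, np.ones((1, ijk.shape[1]))])
    xyz = (affine @ ijk_h)[:3].T
    dist = np.linalg.norm(xyz - np.asarray(center_mm), axis=1)
    mask = (dist <= radius_mm).reshape(shape)
    return nib.Nifti1Image(mask.astype(np.uint8), affine)

# Hypothetical usage with an MNI-space image from the pipeline:
# template = nib.load("wsub-01_task-speech_bold.nii")
# psts_roi = sphere_roi(template, center_mm=(-54, -42, 4), radius_mm=10)
```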

Data availability

Stimuli, behavioral data, and analysis scripts are available from https://osf.io/qxcu8/. MRI data are available from OpenNeuro (Markiewicz et al., 2021) at https://doi.org/10.18112/openneuro.ds003717.v1.0.0.

Results

Unthresholded statistical maps are available from NeuroVault (Gorgolewski et al., 2015) at https://neurovault.org/collections/10922/.

We first examined whole-brain univariate effects by condition, shown in Figure 2 (maxima listed in Extended Data Figs. 2-1 to 2-7). We observed temporal lobe activity in all conditions, including visual-only, and visual cortex activity in all conditions except auditory-only.

Figure 2. Univariate results for spoken word perception in all experimental conditions. Maxima are listed in Extended Data Figures 2-1 to 2-7.

Extended Data Figure 2-1. Peak activations for auditory-only in quiet.
Extended Data Figure 2-2. Peak activations for lipreading (visual-only).
Extended Data Figure 2-3. Peak activations for AV in quiet.
Extended Data Figure 2-4. Peak activations for AV +5.
Extended Data Figure 2-5. Peak activations for AV 0.
Extended Data Figure 2-6. Peak activations for AV −5.
Extended Data Figure 2-7. Peak activations for AV −10.

Figure 3. Psychophysiological interaction analysis for experimental conditions, using a seed from left visual cortex. Warm-colored voxels showed significantly more connectivity with visual cortex in an experimental condition than in the auditory-only condition. Maxima are listed in Extended Data Figures 3-1 to 3-6.

Extended Data Figure 3-1. Peak activations for PPI (V1 seed), AV −10 > A.
Extended Data Figure 3-2. Peak activations for PPI (V1 seed), AV −5 > A.
Extended Data Figure 3-3. Peak activations for PPI (V1 seed), AV 0 > A.
Extended Data Figure 3-4. Peak activations for PPI (V1 seed), AV +5 > A.
Extended Data Figure 3-5. Peak activations for PPI (V1 seed), AV quiet > A.
Extended Data Figure 3-6. Peak activations for PPI (V1 seed), A > V.

Figure 4. Psychophysiological interaction analysis for experimental conditions, using a seed from left auditory cortex. Warm-colored voxels showed significantly more connectivity with auditory cortex in an experimental condition than in the visual-only condition. Maxima are listed in Extended Data Figures 4-1 to 4-6.

Extended Data Figure 4-1. Peak activations for PPI (A1 seed), AV −10 > V.
Extended Data Figure 4-2. Peak activations for PPI (A1 seed), AV −5 > V.
Extended Data Figure 4-3. Peak activations for PPI (A1 seed), AV 0 > V.
Extended Data Figure 4-4. Peak activations for PPI (A1 seed), AV +5 > V.
Extended Data Figure 4-5. Peak activations for PPI (A1 seed), AV quiet > V.
Extended Data Figure 4-6. Peak activations for PPI (A1 seed), A only > V.

Figure 5. Region-of-interest analyses highlighting the role of the left pSTS in speech processing. A, pSTS activity for univariate analyses (compare Fig. 2). B, PPI-based effective connectivity with V1 (compare Fig. 3). C, PPI-based effective connectivity with A1 (compare Fig. 4). Significant differences from zero, corrected for multiple comparisons, are indicated with an asterisk.

We next related the activity during visual-only speech with the out-of-scanner lipreading score (Fig. 1B). Across participants, lipreading accuracy ranged from 4 to 74% (mean = 47.75, SD = 15.49), and correlated with in-scanner ratings (Spearman's ρ = 0.38). We included out-of-scanner lipreading as a covariate to see whether individual differences in out-of-scanner scores related to visual-only activity; we did not find any significant relationship (positive or negative).

Following the univariate analyses, we examined effective connectivity using PPI models. We started with a seed region in left visual cortex. As seen in Figure 3 (maxima listed in Extended Data Figs. 3-1 to 3-6), compared with auditory-only speech, the visual-only and all audiovisual conditions showed increased connectivity with the visual cortex seed, notably including bilateral superior temporal gyrus and auditory cortex. The same was true with an auditory cortex seed, shown in Figure 4 (maxima listed in Extended Data Figs. 4-1 to 4-6). Here, compared with the visual-only condition, we observed increased connectivity with visual cortex in all conditions except the auditory-only condition.

Finally, to complement the above whole-brain analyses, we conducted an ROI analysis focusing on the pSTS, shown in Figure 5. For the whole-brain univariate and PPI analyses described above, we extracted values from left pSTS and used one-sample t tests to determine whether activity was significantly different from zero. Significance (p < 0.05, Bonferroni corrected for 19 tests, giving p < 0.00263) is indicated above each condition.
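A minimal sketch of this ROI test procedure, assuming per-participant ROI estimates have already been extracted, is shown below; the condition names and data layout are illustrative assumptions.

```python
import numpy as np
from scipy.stats import ttest_1samp

def roi_one_sample_tests(roi_values, n_tests=19, alpha=0.05):
    """One-sample t tests against zero for each condition's ROI values,
    Bonferroni corrected across the total number of tests.

    roi_values : dict mapping condition name -> array of per-participant
                 ROI estimates (univariate contrast values or PPI betas)
    """
    threshold = alpha / n_tests          # 0.05 / 19 ~= 0.00263
    results = {}
    for cond, vals in roi_values.items():
        t, p = ttest_1samp(vals, popmean=0.0)
        results[cond] = {"t": t, "p": p, "significant": p < threshold}
    return results

# Hypothetical usage with extracted pSTS values per condition:
# results = roi_one_sample_tests({"AV_quiet": av_quiet_betas, "V_only": v_betas})
```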

Discussion

We studied brain activity during visual-only and audiovisual speech perception. We found that connectivity between auditory, visual, and premotor cortex was enhanced during audiovisual speech processing relative to unimodal processing and during visual-only speech processing relative to auditory-only speech processing. These findings are broadly consistent with a role for synchronized interregional neural activity supporting visual and audiovisual speech perception.

Dedicated regions for multisensory speech processing

Although understanding audiovisual speech requires combining information from multiple modalities, the way this happens is unclear. One possibility is that heteromodal brain regions such as the pSTS act to integrate unisensory inputs. In addition to combining signals to form a unitary percept, regions such as pSTS may also give more weight to more informative modalities (e.g., to the visual signal when the auditory signal is noisy; Nath and Beauchamp, 2011).

Activity in pSTS for visual-only or AV speech was suggested by both our whole-brain and ROI-based analyses, consistent with a role for pSTS in integrating or combining auditory and visual information. Of course, pSTS activity is not always observed for AV speech (Erickson et al., 2014). One potential explanation for the variability in pSTS activation across studies is the nature of the speech materials. Several previous studies identifying pSTS involvement in multisensory speech perception have used incongruent stimuli (i.e., a McGurk task; McGurk and MacDonald, 1976), which differs substantially from most of our everyday speech perception experience (Van Engen et al., 2019). Thus, the conditions under which pSTS is recruited to support visual or AV speech perception remain an open question.

In our univariate results, we observed activity in premotor cortex for both visual-only speech in quiet and AV speech at more challenging signal-to-noise ratios. These findings are consistent with a flexible role for premotor cortex in speech perception, at least under some circumstances, as reported in other studies of visual and audiovisual speech perception (Venezia et al., 2017). Although our current data do not support specific conclusions, the dependence of premotor activity on task demands may explain some of the inconsistencies underlying the debates about the role of premotor cortex that permeate the speech perception literature.

Effective connectivity and multisensory speech processing

A different perspective comes from a focus on multisensory effects in auditory and visual cortex (Peelle and Sommers, 2015). Much of the support for this early integration view comes from electrophysiology studies showing multimodal effects in primary sensory regions (e.g., Schroeder and Foxe, 2005). For example, Lakatos et al. (2007) found that somatosensory input reset the phase of ongoing neural oscillations in auditory cortex, which was hypothesized to increase sensitivity to auditory stimuli. In at least one human magnetoencephalography study, audiovisual effects appear sooner in auditory cortex than in pSTS (Möttönen et al., 2004), and visual speech may speed processing in auditory cortex (van Wassenhove et al., 2005). These findings suggest that multisensory effects are present in primary sensory regions and that auditory and visual information do not require a separate brain region in which to integrate.

In the current data, we observed stronger connectivity between auditory and visual cortex for visual-only and audiovisual speech conditions than for unimodal auditory-only speech and stronger connectivity in audiovisual speech conditions than in unimodal visual-only speech. That is, using a visual cortex seed we found increases in effective connectivity with auditory cortex, and when using an auditory cortex seed we found increases in effective connectivity with visual cortex. These complementary findings indicate that functionally coordinated activity among primary sensory regions is increased during audiovisual speech perception.

Beyond primary sensory cortices, we also observed effective connectivity changes to premotor cortex for both visual-only speech and several audiovisual conditions. The functional synchronization among visual cortex, auditory cortex, and premotor cortex is consistent with a distributed network that orchestrates activity in response to visual-only and audiovisual speech.

Finally, our ROI analysis showed increased effective connectivity between pSTS and V1, but not A1, under most experimental conditions (Fig. 5). These effective connectivity changes with V1 are consistent with a role for pSTS in audiovisual speech processing. However, they are also not easily reconcilable with studies reporting connectivity differences between pSTS and both A1 and V1 (Nath and Beauchamp, 2011). Although no doubt the location and size of any pSTS ROI chosen is important, we used the same ROI for the PPI analyses with both the A1 seed and V1 seed, and so ROI definition alone does not seem to explain the qualitative difference between the two.

It may be worth considering whether the pSTS plays a different role in relation to A1 and V1. Just because pSTS responds to both auditory and visual information does not necessarily mean it treats them equally or integrates them in a modality-agnostic manner. Indeed, just as unisensory cortices show multisensory effects and anatomic connections (Cappe and Barone, 2005), heteromodal or multisensory regions can exhibit modality preferences (Noyce et al., 2017). In many audiovisual tasks, auditory information appears to be preferentially processed (Grondin and Rousseau, 1991; Recanzone, 2003; Grondin and McAuley, 2009; Grahn et al., 2011). Thus, pSTS may be particularly important in integrating visual information into an existing auditory-dominated percept. Relatedly, it could also be that multimodal information is inextricably bound at early stages of perception (Rosenblum, 2008), a process which may rely on pSTS.

The emerging picture is one in which coordination of large-scale brain networks, that is, effective connectivity reflecting time-locked functional processing, is associated with visual-only and audiovisual speech processing. What might be the function of such distributed, coordinated activity? Visual and audiovisual speech appear to rely on multisensory representations. For audiovisual speech, it may seem obvious that successful perception requires combining auditory and visual information. However, visual-only speech has been consistently associated with activity in auditory cortex (Calvert et al., 1997; Okada et al., 2013). These activations may correspond to visual-auditory and auditory-motor associations, learned from audiovisual speech, that are automatically reactivated even when the auditory input is absent.

Interestingly, our out-of-scanner lipreading scores did not correlate with any of the whole-brain results. It should be noted, however, that our sample size, although large for fMRI studies of audiovisual speech processing, may still be too small to reliably detect individual differences in brain activity patterns (Yarkoni and Braver, 2010). Moreover, there may be multiple ways that brains can support better lipreading, and such heterogeneity in brain patterns would not be evident in our current analyses. Future studies with larger sample sizes may be needed to quantitatively assess the degree to which participants' activity falls into distinct neural strategies and the degree to which these strategies relate to lipreading performance.

It is worth highlighting an intriguing aspect of our data, which is that auditory cortex is always engaged, even in visual-only conditions, whereas the reverse is not true for visual cortex, which is only engaged when visual information is present (Fig. 2). This observation may relate to deeper theoretical issues regarding the fundamental modality of speech representation. That is, if auditory representations have primacy (at least, for hearing people), we might expect these representations to be activated regardless of the input modality (i.e., for both auditory and visual speech). In fact, this is exactly what we have observed. Although these findings do not directly address the level of detail contained in visual cortex speech representations (Bernstein and Liebenthal, 2014), they are consistent with asymmetric auditory and visual speech representations.

Different perspectives on multisensory integration during speech perception

An enduring challenge for understanding multisensory speech perception can be found in differing uses of the word integration. During audiovisual speech perception, listeners use both auditory and visual information, and so from one perspective, both kinds of information are necessarily integrated into a listener's (unified) perceptual experience. However, such use of both auditory and visual information does not necessitate a separable cognitive stage for integration (Tye-Murray et al., 2016; Sommers, 2021), nor does it necessitate a region of the brain devoted to integration. The interregional coordination we observed here may accomplish the task of integration in that both auditory and visual modalities are shaping perception. In this framework, there is no need to first translate visual and auditory speech information into some kind of common code (Altieri et al., 2011).

With any study it is important to consider how the specific stimuli used influenced the results. Here, we examined processing for single words. Visual speech can inform perception in multiple dimensions (Peelle and Sommers, 2015), including by providing clues to the speech envelope (Chandrasekaran et al., 2009). These clues may be more influential in connected speech (e.g., sentences) than in single words, as other neural processes may come into play with connected speech.

Conclusion

Our findings demonstrate the scaffolding of connectivity among auditory, visual, and premotor cortices that supports visual-only and audiovisual speech perception. These findings suggest that the binding of multisensory information need not be restricted to heteromodal brain regions (e.g., pSTS) but may also emerge from coordinated unimodal activity throughout the brain.

Footnotes

  • This work was supported by National Institutes of Health Grants R56 AG018029 and R01 DC016594. The multiband echoplanar imaging sequence was provided by the University of Minnesota Center for Magnetic Resonance Research.

  • The authors declare no competing financial interests.

  • Correspondence should be addressed to Jonathan E. Peelle at jpeelle@wustl.edu

SfN exclusive license.

References

  1. Altieri N, Pisoni DB, Townsend JT (2011) Some behavioral and neurobiological constraints on theories of audiovisual speech integration: a review and suggestions for new directions. Seeing Perceiving 24:513–539. doi:10.1163/187847611X595864 pmid:21968081
  2. Ashburner J, Friston KJ (2005) Unified segmentation. Neuroimage 26:839–851. doi:10.1016/j.neuroimage.2005.02.018 pmid:15955494
  3. Balota DA, Yap MJ, Cortese MJ, Hutchison KA, Kessler B, Loftis B, Neely JH, Nelson DL, Simpson GB, Treiman R (2007) The English Lexicon Project. Behav Res Methods 39:445–459.
  4. Beauchamp MS, Argall BD, Bodurka J, Duyn JH, Martin A (2004) Unraveling multisensory integration: patchy organization within human STS multisensory cortex. Nat Neurosci 7:1190–1192. doi:10.1038/nn1333 pmid:15475952
  5. Bernstein LE, Liebenthal E (2014) Neural pathways for visual speech perception. Front Neurosci 8:386. doi:10.3389/fnins.2014.00386 pmid:25520611
  6. Calvert GA, Bullmore ET, Brammer MJ, Campbell R, Williams SCR, McGuire PK, Woodruff PWR, Iversen SD, David AS (1997) Activation of auditory cortex during silent lipreading. Science 276:593–596. doi:10.1126/science.276.5312.593 pmid:9110978
  7. Cappe C, Barone P (2005) Heteromodal connections supporting multisensory integration at low levels of cortical processing in the monkey. Eur J Neurosci 22:2886–2902. doi:10.1111/j.1460-9568.2005.04462.x pmid:16324124
  8. Chandrasekaran C, Trubanova A, Stillittano S, Caplier A, Ghazanfar AA (2009) The natural statistics of audiovisual speech. PLoS Comput Biol 5:e1000436. doi:10.1371/journal.pcbi.1000436 pmid:19609344
  9. Cusack R, Vicente-Grabovetsky A, Mitchell DJ, Wild CJ, Auer T, Linke AC, Peelle JE (2014) Automatic analysis (aa): efficient neuroimaging workflows and parallel processing using Matlab and XML. Front Neuroinform 8:90. doi:10.3389/fninf.2014.00090 pmid:25642185
  10. Devlin JT, Poldrack RA (2007) In praise of tedious anatomy. Neuroimage 37:1033–1041. doi:10.1016/j.neuroimage.2006.09.055 pmid:17870621
  11. Edmister WB, Talavage TM, Ledden PJ, Weisskoff RM (1999) Improved auditory cortex imaging using clustered volume acquisitions. Hum Brain Mapp 7:89–97. doi:10.1002/(SICI)1097-0193(1999)7:2<89::AID-HBM2>3.0.CO;2-N
  12. Eickhoff SB, Stephan KE, Mohlberg H, Grefkes C, Fink GR, Amunts K, Zilles K (2005) A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. Neuroimage 25:1325–1335. doi:10.1016/j.neuroimage.2004.12.034 pmid:15850749
  13. Erickson LC, Heeg E, Rauschecker JP, Turkeltaub PE (2014) An ALE meta-analysis on the audiovisual integration of speech signals. Hum Brain Mapp 35:5587–5605. doi:10.1002/hbm.22572 pmid:24996043
  14. Feinberg DA, Moeller S, Smith SM, Auerbach E, Ramanna S, Gunther M, Glasser MF, Miller KL, Ugurbil K, Yacoub E (2010) Multiplexed echo planar imaging for sub-second whole brain FMRI and fast diffusion imaging. PLoS One 5:e15710. doi:10.1371/journal.pone.0015710 pmid:21187930
  15. Fridriksson J, Baker JM, Whiteside J, Eoute D Jr, Moser D, Vesselinov R, Rorden C (2009) Treating visual speech perception to improve speech production in nonfluent aphasia. Stroke 40:853–858. doi:10.1161/STROKEAHA.108.532499 pmid:19164782
  16. Friston KJ (1994) Functional and effective connectivity in neuroimaging: a synthesis. Hum Brain Mapp 2:56–78. doi:10.1002/hbm.460020107
  17. Friston KJ, Buechel C, Fink GR, Morris J, Rolls E, Dolan RJ (1997) Psychophysiological and modulatory interactions in neuroimaging. Neuroimage 6:218–229. doi:10.1006/nimg.1997.0291 pmid:9344826
  18. Gorgolewski KJ, Varoquaux G, Rivera G, Schwarz Y, Ghosh SS, Maumet C, Sochat VV, Nichols TE, Poldrack RA, Poline J-B, Yarkoni T, Margulies DS (2015) NeuroVault.org: a web-based repository for collecting and sharing unthresholded statistical maps of the human brain. Front Neuroinform 9:8. doi:10.3389/fninf.2015.00008 pmid:25914639
  19. Grahn JA, Henry MJ, McAuley JD (2011) FMRI investigation of cross-modal interactions in beat perception: audition primes vision, but not vice versa. Neuroimage 54:1231–1243. doi:10.1016/j.neuroimage.2010.09.033 pmid:20858544
  20. Grant KW, Seitz PF (1998) Measures of auditory-visual integration in nonsense syllables and sentences. J Acoust Soc Am 104:2438–2450. doi:10.1121/1.423751 pmid:10491705
  21. Grondin S, Rousseau R (1991) Judging the relative duration of multimodal short empty time intervals. Percept Psychophys 49:245–256. doi:10.3758/bf03214309 pmid:2011462
  22. Grondin S, McAuley D (2009) Duration discrimination in crossmodal sequences. Perception 38:1542–1559. doi:10.1068/p6359 pmid:19950485
  23. Hall DA, Haggard MP, Akeroyd MA, Palmer AR, Summerfield AQ, Elliott MR, Gurney EM, Bowtell RW (1999) "Sparse" temporal sampling in auditory fMRI. Hum Brain Mapp 7:213–223. doi:10.1002/(SICI)1097-0193(1999)7:3<213::AID-HBM5>3.0.CO;2-N
  24. Jenkinson M, Beckmann CF, Behrens TEJ, Woolrich MW, Smith SM (2012) FSL. Neuroimage 62:782–790. doi:10.1016/j.neuroimage.2011.09.015 pmid:21979382
  25. Jones MS, Zhu Z, Bajracharya A, Luor A, Peelle JE (2021) A multi-dataset evaluation of frame censoring for task-based fMRI. bioRxiv 464075. doi:10.1101/2021.10.12.464075
  26. Lakatos P, Chen C-M, O'Connell MN, Mills A, Schroeder CE (2007) Neuronal oscillations and multisensory interaction in primary auditory cortex. Neuron 53:279–292. doi:10.1016/j.neuron.2006.12.011 pmid:17224408
  27. Lemieux L, Salek-Haddadi A, Lund TE, Laufs H, Carmichael D (2007) Modelling large motion events in fMRI studies of patients with epilepsy. Magn Reson Imaging 25:894–901. doi:10.1016/j.mri.2007.03.009 pmid:17490845
  28. Magnotti JF, Beauchamp MS (2015) The noisy encoding of disparity model of the McGurk effect. Psychon Bull Rev 22:701–709. doi:10.3758/s13423-014-0722-2 pmid:25245268
  29. Mallick DB, Magnotti JF, Beauchamp MS (2015) Variability and stability in the McGurk effect: contributions of participants, stimuli, time, and response type. Psychon Bull Rev 22:1299–1307. doi:10.3758/s13423-015-0817-4 pmid:25802068
  30. Markiewicz CJ, Gorgolewski KJ, Feingold F, Blair R, Halchenko YO, Miller E, Hardcastle N, Wexler J, Esteban O, Goncalves M, Jwa A, Poldrack R (2021) The OpenNeuro resource for sharing of neuroscience data. Elife 10:e71774.
  31. Massaro DW, Palmer SE Jr (1998) Perceiving talking faces: from speech perception to a behavioral principle. Cambridge, MA: MIT.
  32. McGurk H, MacDonald J (1976) Hearing lips and seeing voices. Nature 264:746–748. doi:10.1038/264746a0 pmid:1012311
  33. Morosan P, Rademacher J, Schleicher A, Amunts K, Schormann T, Zilles K (2001) Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system. Neuroimage 13:684–701. doi:10.1006/nimg.2000.0715 pmid:11305897
  34. Möttönen R, Schürmann M, Sams M (2004) Time course of multisensory interactions during audiovisual speech perception in humans: a magnetoencephalographic study. Neurosci Lett 363:112–115. doi:10.1016/j.neulet.2004.03.076 pmid:15172096
  35. Nath AR, Beauchamp MS (2011) Dynamic changes in superior temporal sulcus connectivity during perception of noisy audiovisual speech. J Neurosci 31:1704–1714. doi:10.1523/JNEUROSCI.4853-10.2011 pmid:21289179
  36. Noyce AL, Cestero N, Michalka SW, Shinn-Cunningham BG, Somers DC (2017) Sensory-biased and multiple-demand processing in human lateral frontal cortex. J Neurosci 37:8755–8766. doi:10.1523/JNEUROSCI.0660-17.2017 pmid:28821668
  37. Nuttall HE, Kennedy-Higgins D, Hogan J, Devlin JT, Adank P (2016) The effect of speech distortion on the excitability of articulatory motor cortex. Neuroimage 128:218–226. doi:10.1016/j.neuroimage.2015.12.038 pmid:26732405
  38. Okada K, Hickok G (2009) Two cortical mechanisms support the integration of visual and auditory speech: a hypothesis and preliminary data. Neurosci Lett 452:219–223. doi:10.1016/j.neulet.2009.01.060 pmid:19348727
  39. Okada K, Venezia JH, Matchin W, Saberi K, Hickok G (2013) An fMRI study of audiovisual speech perception reveals multisensory interactions in auditory cortex. PLoS One 8:e68959. doi:10.1371/journal.pone.0068959 pmid:23805332
  40. Peelle JE, Sommers MS (2015) Prediction and constraint in audiovisual speech perception. Cortex 68:169–181. doi:10.1016/j.cortex.2015.03.006 pmid:25890390
  41. Power JD, Barnes KA, Snyder AZ, Schlaggar BL, Petersen SE (2012) Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage 59:2142–2154. doi:10.1016/j.neuroimage.2011.10.018 pmid:22019881
  42. Recanzone GH (2003) Auditory influences on visual temporal rate perception. J Neurophysiol 89:1078–1093. doi:10.1152/jn.00706.2002 pmid:12574482
  43. Rorden C, Brett M (2000) Stereotaxic display of brain lesions. Behav Neurol 12:191–200. doi:10.1155/2000/421719 pmid:11568431
  44. Rosenblum LD (2008) Speech perception as a multimodal phenomenon. Curr Dir Psychol Sci 17:405–409. doi:10.1111/j.1467-8721.2008.00615.x pmid:23914077
  45. Schroeder CE, Foxe J (2005) Multisensory contributions to low-level, "unisensory" processing. Curr Opin Neurobiol 15:454–458. doi:10.1016/j.conb.2005.06.008 pmid:16019202
  46. Sommers MS (2021) Santa Claus, the tooth fairy, and auditory-visual integration. In: The handbook of speech perception, pp 517–539.
  47. Stephan KE, Friston KJ (2010) Analyzing effective connectivity with functional magnetic resonance imaging. Wiley Interdiscip Rev Cogn Sci 1:446–459. doi:10.1002/wcs.58 pmid:21209846
  48. Stevenson RA, James TW (2009) Audiovisual integration in human superior temporal sulcus: inverse effectiveness and the neural processing of speech and object recognition. Neuroimage 44:1210–1223. doi:10.1016/j.neuroimage.2008.09.034 pmid:18973818
  49. Sumby WH, Pollack I (1954) Visual contribution to speech intelligibility in noise. J Acoust Soc Am 26:212–215. doi:10.1121/1.1907309
  50. Szenkovits G, Peelle JE, Norris D, Davis MH (2012) Individual differences in premotor and motor recruitment during speech perception. Neuropsychologia 50:1380–1392. doi:10.1016/j.neuropsychologia.2012.02.023 pmid:22521874
  51. Tye-Murray N, Spehar BP, Myerson J, Hale S, Sommers MS (2013) Reading your own lips: common-coding theory and visual speech perception. Psychon Bull Rev 20:115–119. doi:10.3758/s13423-012-0328-5 pmid:23132604
  52. Tye-Murray N, Spehar BP, Myerson J, Hale S, Sommers MS (2015) The self-advantage in visual speech processing enhances audiovisual speech recognition in noise. Psychon Bull Rev 22:1048–1053. doi:10.3758/s13423-014-0774-3 pmid:25421408
  53. Tye-Murray N, Spehar B, Myerson J, Hale S, Sommers M (2016) Lipreading and audiovisual speech recognition across the adult lifespan: implications for audiovisual integration. Psychol Aging 31:380–389. doi:10.1037/pag0000094 pmid:27294718
  54. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M (2002) Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15:273–289. doi:10.1006/nimg.2001.0978 pmid:11771995
  55. Van Engen KJ, Dey A, Sommers M, Peelle JE (2019) Audiovisual speech perception: moving beyond McGurk. Available at psyarxiv.com/6y8qw.
  56. van Wassenhove V, Grant KW, Poeppel D (2005) Visual speech speeds up the neural processing of auditory speech. Proc Natl Acad Sci U S A 102:1181–1186. doi:10.1073/pnas.0408949102 pmid:15647358
  57. Venezia JH, Fillmore P, Matchin W, Isenberg AL, Hickok G, Fridriksson J (2016) Perception drives production across sensory modalities: a network for sensorimotor integration of visual speech. Neuroimage 126:196–207. doi:10.1016/j.neuroimage.2015.11.038 pmid:26608242
  58. Venezia JH, Vaden KI Jr, Rong F, Maddox D, Saberi K, Hickok G (2017) Auditory, visual and audiovisual speech processing streams in superior temporal sulcus. Front Hum Neurosci 11:174. doi:10.3389/fnhum.2017.00174 pmid:28439236
  59. Yarkoni T, Braver TS (2010) Cognitive neuroscience approaches to individual differences in working memory and executive control: conceptual and methodological issues. In: Handbook of individual differences in cognition (Gruszka A, Matthews G, Szymura B, eds), pp 87–107. New York: Springer.