Journal Club

The Contribution of Motion-Sensitive Brain Areas to Visual Speech Recognition

Jirka Liessens and Simon Ladouce
Journal of Neuroscience 21 August 2024, 44 (34) e0767242024; https://doi.org/10.1523/JNEUROSCI.0767-24.2024
Department of Brain and Cognition, Leuven Brain Institute, KU Leuven, Leuven, Vlaams-Brabant 3000, Belgium

The integration of multiple sources of sensory information, particularly visual information such as lip movements, can substantially facilitate comprehension of spoken language. Although research on verbal comprehension initially focused on the processing of auditory information, the field has since expanded to a wider range of input modalities and now encompasses the integration of auditory and visual information (Bernstein and Liebenthal, 2014). Considering the processing of visual information is essential for addressing questions of speech comprehension such as how individuals compensate for unreliable auditory input (e.g., a noisy environment) or for hearing impairments. The growing evidence that visual processing contributes to speech recognition prompts investigations into the functional and structural neural underpinnings of this contribution.

Visual speech recognition involves interpreting mouth and lip movements to comprehend spoken words. The middle temporal visual area (V5/MT), known for its role in motion processing, has also been linked to visual speech perception. Indeed, both positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) studies have consistently reported V5/MT activation during lip reading (Paulesu et al., 2003; Santi et al., 2003). Beyond this association, however, there is no direct evidence that V5/MT is necessary for speech comprehension. Furthermore, Mather et al. (2016) found that disruptive transcranial magnetic stimulation (TMS) over V5/MT affected the processing of nonbiological motion but did not impact the processing of biological motion. Because lip movements are a form of biological motion, this result leaves the causal role of V5/MT in visual speech perception unclear. Further functional studies are therefore needed to clarify the role of V5/MT in motion processing and speech comprehension.

In a recent study, Jeschke et al. (2023) tested whether V5/MT is necessary for visual speech recognition by applying inhibitory TMS pulses over V5/MT while volunteers performed tasks requiring visual speech recognition and motion processing. Participants received bilateral inhibitory TMS, either over V5/MT or over the vertex (a control site), to disrupt neural processing in a focal area, akin to a temporary virtual lesion. Participants performed two tasks, one assessing nonbiological motion processing and one assessing visual speech recognition. In the first task, participants viewed two displays of randomly moving dots in sequence and indicated whether the overall direction of movement matched across the two displays. In the second task, participants viewed two muted videos of a female speaker uttering visually distinct vowel–consonant–vowel combinations and, similarly, indicated whether the combinations in the two videos matched. Accuracy and response times were recorded before TMS was applied to establish baseline performance; participants then performed both tasks during inhibitory TMS and again after stimulation.
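To make the structure of the nonbiological motion task concrete, the following is a minimal Python sketch of a random-dot, direction-matching trial. The dot count, motion coherence, and angular separation between nonmatching directions are illustrative assumptions, not the parameters used by Jeschke et al. (2023).

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def dot_steps(n_dots, coherence, direction_deg, step=1.0):
    """Per-frame displacement vectors for one random-dot display.

    A `coherence` fraction of dots moves in `direction_deg`; the
    remaining dots move in random directions, giving the display an
    overall motion direction that the matching task can probe.
    """
    is_signal = rng.random(n_dots) < coherence
    angles = rng.uniform(0.0, 2 * np.pi, n_dots)   # noise dots: random headings
    angles[is_signal] = np.deg2rad(direction_deg)  # signal dots: shared heading
    return step * np.column_stack([np.cos(angles), np.sin(angles)])

def direction_matching_trial(match, n_dots=100, coherence=0.5):
    """One trial: two displays whose overall directions either match or
    differ; the participant reports 'same' or 'different'."""
    d1 = rng.uniform(0.0, 360.0)
    d2 = d1 if match else (d1 + rng.uniform(90.0, 270.0)) % 360.0
    return dot_steps(n_dots, coherence, d1), dot_steps(n_dots, coherence, d2), match
```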

As expected, TMS-mediated inhibition of V5/MT was associated with significantly increased response times in both the visual speech recognition task and the nonbiological motion task relative to stimulation of the vertex. In addition, TMS inhibition of V5/MT was associated with a significantly smaller practice effect (indexed by the ratio of pre-stimulation to post-stimulation response times) in both tasks compared with stimulation of the vertex, likely as a result of reduced task performance during stimulation. Taken together, these results suggest that V5/MT causally contributes to visual speech recognition and nonbiological motion processing.
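As a concrete illustration of this measure, here is a minimal sketch of the practice-effect ratio; the use of median response times and the example values are assumptions made for illustration, not the authors' exact analysis.

```python
import numpy as np

def practice_effect(pre_rts, post_rts):
    """Practice effect as the ratio of pre- to post-stimulation median
    response times: values above 1 indicate speeding with practice,
    while values near 1 indicate a reduced practice effect."""
    return np.median(pre_rts) / np.median(post_rts)

# Hypothetical response times in seconds for one participant.
pre = np.array([0.92, 0.88, 0.95, 0.90])     # before stimulation
post = np.array([0.78, 0.80, 0.76, 0.81])    # after stimulation
print(round(practice_effect(pre, post), 2))  # 1.15 -> clear practice effect
```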

These findings help establish V5/MT as part of the interconnected neural network underlying visual speech recognition (see Fig. 1 for a simplified illustration). Visual speech recognition relies on a network of visual areas that process both form and motion (Arnal et al., 2011). Previous work showed that V5/MT is part of the dorsal visual stream and is thus associated with motion. Other studies have linked visual speech recognition to activation in the posterior superior temporal sulcus (pSTS) and in a region inferior and posterior to the pSTS known as the temporal visual speech area (TVSA; Paulesu et al., 2003; Arnal et al., 2011; Bernstein and Liebenthal, 2014). The TVSA is primarily associated with the ventral visual stream for form. Bernstein and Liebenthal (2014) proposed that the pSTS integrates visual information from both V5/MT and the TVSA to facilitate the processing of auditory signals by modulating primary auditory cortical areas within Heschl's gyrus. Future work should elucidate the functional connections between subareas of the pSTS and V5/MT using imaging methods with high spatial resolution, such as fMRI at 7 tesla or above.

Figure 1.

Illustration of the connections between V5/MT, the TVSA, the pSTG/S, and Heschl's gyrus within the visual speech recognition functional network. V5/MT and the TVSA (in blue) have feedforward connections to the pSTG/S (associative area in yellow). The pSTG/S has an efferent modulatory connection (dotted arrow) to primary auditory cortical areas located within Heschl's gyrus (auditory areas in green), facilitating the processing of auditory input based on previously integrated visual information. Only brain areas relevant to this Journal Club article are depicted in this illustration of the visual speech recognition network.
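For readers who prefer an explicit data structure, the connectivity depicted in Figure 1 can be summarized as a small directed graph; the Python sketch below reduces each pathway to a labeled edge, which is a deliberate simplification of the figure.

```python
# Directed edges of the simplified network in Figure 1. Edge labels
# distinguish feedforward visual input from the modulatory feedback
# onto primary auditory cortex.
visual_speech_network = {
    ("V5/MT", "pSTG/S"): "feedforward (dorsal stream: motion)",
    ("TVSA", "pSTG/S"): "feedforward (ventral stream: form)",
    ("pSTG/S", "Heschl's gyrus"): "modulatory (facilitates auditory processing)",
}

for (source, target), label in visual_speech_network.items():
    print(f"{source} -> {target}: {label}")
```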

An aspect of visual speech recognition that Jeschke et al. (2023) did not address is cross-linguistic differences in how this network functions and how V5/MT contributes to it. Most research on visual speech recognition has involved native speakers of Germanic languages. However, Sekiyama and Burnham (2008) found behavioral differences between Japanese and English native speakers in the use of visual speech information: English native speakers exhibit shorter reaction times for visually congruent speech information than their Japanese counterparts. To further investigate these behavioral differences, Shinozaki et al. (2016) studied functional connectivity during a visual speech recognition task and found that V5/MT activity correlated more strongly with activity in Heschl's gyrus in English speakers than in Japanese speakers. This finding hints at a contribution of V5/MT to the network that differs across languages. It is also possible that motion information from V5/MT is weighted more strongly during multisensory integration in the STS in English speakers than in Japanese speakers. A TMS study spanning multiple languages could reveal whether the effect of stimulation depends on the language, for example, by comparing languages that differ in their number of visual speech signals (i.e., visemes), as suggested by Shinozaki et al. (2016). English and Japanese would provide a suitable contrast for such studies.

In conclusion, Jeschke et al. (2023) found that inhibitory TMS over V5/MT increased response times and decreased practice effects during visual speech recognition. Future research should examine the exact contribution of V5/MT within the larger visual speech recognition network and whether cross-linguistic differences have structural and/or functional underpinnings. More broadly, a better understanding of the role of V5/MT in speech recognition could contribute to a more comprehensive model of speech processing that applies across languages.

Footnotes

  • We thank Prof. Jonas Obleser, Prof. Céline R. Gillebert, and Teresa Esch for their helpful suggestions during the preparation of this Journal Club. This work was supported by PEP-D4491-GOH7718 N Odysseus (ZKD4491-05-W01). Mentored by Prof. Jonas Obleser, University of Lübeck; jonas.obleser@uni-luebeck.de.

  • The authors declare no competing financial interests.

  • Editor’s Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/jneurosci-journal-club.

  • Correspondence should be addressed to Jirka Liessens at jirka.liessens@student.kuleuven.be.

SfN exclusive license.

References

  1. Arnal LH, Wyart V, Giraud A-L (2011) Transitions in neural oscillations reflect prediction errors generated in audiovisual speech. Nat Neurosci 14:797–801. https://doi.org/10.1038/nn.2810
  2. Bernstein LE, Liebenthal E (2014) Neural pathways for visual speech perception. Front Neurosci 8:386. https://doi.org/10.3389/fnins.2014.00386
  3. Jeschke L, Mathias B, von Kriegstein K (2023) Inhibitory TMS over visual area V5/MT disrupts visual speech recognition. J Neurosci 43:7690–7699. https://doi.org/10.1523/JNEUROSCI.0975-23.2023
  4. Mather G, Battaglini L, Campana G (2016) TMS reveals flexible use of form and motion cues in biological motion perception. Neuropsychologia 84:193–197. https://doi.org/10.1016/j.neuropsychologia.2016.02.015
  5. Paulesu E, Perani D, Blasi V, Silani G, Borghese NA, De Giovanni U, Sensolo S, Fazio F (2003) A functional-anatomical model for lipreading. J Neurophysiol 90:2005–2013. https://doi.org/10.1152/jn.00926.2002
  6. Santi A, Servos P, Vatikiotis-Bateson E, Kuratate T, Munhall K (2003) Perceiving biological motion: dissociating visible speech from walking. J Cogn Neurosci 15:800–809. https://doi.org/10.1162/089892903322370726
  7. Sekiyama K, Burnham D (2008) Impact of language on development of auditory-visual speech perception. Dev Sci 11:306–320. https://doi.org/10.1111/j.1467-7687.2008.00677.x
  8. Shinozaki J, Hiroe N, Sato M, Nagamine T, Sekiyama K (2016) Impact of language on functional connectivity for audiovisual speech integration. Sci Rep 6:31388. https://doi.org/10.1038/srep31388