The integration of multiple sources of sensory information, particularly visual cues such as lip movements, can substantially facilitate the comprehension of spoken language. Accordingly, although research on speech comprehension initially focused on the processing of auditory input, the field has since expanded to a wider range of input modalities and now encompasses the integration of auditory and visual information (Bernstein and Liebenthal, 2014). Accounting for visual processing is essential for addressing questions of speech comprehension such as how individuals compensate for unreliable auditory input (e.g., in a noisy environment) or for hearing impairments. The growing evidence that visual processing contributes to speech recognition prompts investigation of the functional and structural neural underpinnings of this contribution.
Visual speech recognition involves interpreting mouth and lip movements to comprehend spoken words. The middle temporal visual area (V5/MT), known for its role in motion processing, has also been linked to visual speech perception. Indeed, both positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) studies have consistently reported V5/MT activation during lip reading (Paulesu et al., 2003; Santi et al., 2003). Beyond this association, however, there is no direct evidence that V5/MT is necessary for speech comprehension. Furthermore, Mather et al. (2016) found that disruptive transcranial magnetic stimulation (TMS) over V5/MT impaired the processing of nonbiological motion but left the processing of biological motion unaffected. These findings suggest that although V5/MT is involved in visual motion processing, its causal role in the perception of biological motion, and hence in visual speech recognition, remains to be established.