Visual face-movement sensitive cortex is relevant for auditory-only speech recognition

Cortex. 2015 Jul;68:86-99. doi: 10.1016/j.cortex.2014.11.016. Epub 2014 Dec 23.

Abstract

It is commonly assumed that the recruitment of visual areas during audition is not relevant for performing auditory tasks ('auditory-only view'). According to an alternative view, however, the recruitment of visual cortices is thought to optimize auditory-only task performance ('auditory-visual view'). This alternative view is based on functional magnetic resonance imaging (fMRI) studies. These studies have shown, for example, that even if there is only auditory input available, face-movement sensitive areas within the posterior superior temporal sulcus (pSTS) are involved in understanding what is said (auditory-only speech recognition). This is particularly the case when speakers are known audio-visually, that is, after brief voice-face learning. Here we tested whether left pSTS involvement is causally related to performance in auditory-only speech recognition when speakers are known by face. To test this hypothesis, we applied cathodal transcranial direct current stimulation (tDCS) to the pSTS during (i) visual-only speech recognition of a speaker known only visually to participants and (ii) auditory-only speech recognition of speakers they had learned by voice and face. We defined the cathode as the active electrode to down-regulate cortical excitability by hyperpolarization of neurons. tDCS to the pSTS interfered with visual-only speech recognition performance compared to a control group without pSTS stimulation (tDCS to BA6/44 or sham). Critically, compared to controls, pSTS stimulation additionally decreased auditory-only speech recognition performance selectively for voice-face learned speakers. These results are important in two ways. First, they provide direct evidence that the pSTS is causally involved in visual-only speech recognition; this confirms a long-standing prediction of current face-processing models. Second, they show that the visual face-sensitive pSTS is causally involved in optimizing auditory-only speech recognition. These results are in line with the 'auditory-visual view' of auditory speech perception, which assumes that auditory speech recognition is optimized by using predictions from previously encoded speaker-specific audio-visual internal models.

Keywords: Auditory; Lip-reading; Prediction; Speech; pSTS; tDCS.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Auditory Perception / physiology*
  • Cerebral Cortex / physiology
  • Electrodes
  • Face / physiology*
  • Female
  • Humans
  • Magnetic Resonance Imaging
  • Male
  • Models, Neurological
  • Movement / physiology*
  • Neurons / physiology
  • Speech Perception / physiology*
  • Transcranial Direct Current Stimulation / adverse effects
  • Visual Cortex / physiology*
  • Young Adult