Representation of speech in human auditory cortex: is it special?

Hear Res. 2013 Nov;305:57-73. doi: 10.1016/j.heares.2013.05.013. Epub 2013 Jun 18.

Abstract

Successful categorization of phonemes in speech requires that the brain analyze the acoustic signal along both spectral and temporal dimensions. Neural encoding of the stimulus amplitude envelope is critical for parsing the speech stream into syllabic units. Encoding of voice onset time (VOT) and place of articulation (POA), cues necessary for determining phonemic identity, occurs within shorter time frames. An unresolved question is whether the neural representation of speech is based on processing mechanisms that are unique to humans and shaped by learning and experience, or is based on rules governing general auditory processing that are also present in non-human animals. This question was examined by comparing the neural activity elicited by speech and other complex vocalizations in primary auditory cortex of macaques, who are limited vocal learners, with that in Heschl's gyrus, the putative location of primary auditory cortex in humans. Entrainment to the amplitude envelope is neither specific to humans nor to human speech. VOT is represented by responses time-locked to consonant release and voicing onset in both humans and monkeys. Temporal representation of VOT is observed both for isolated syllables and for syllables embedded in the more naturalistic context of running speech. The fundamental frequency of male speakers is represented by more rapid neural activity phase-locked to the glottal pulsation rate in both humans and monkeys. In both species, the differential representation of stop consonants varying in their POA can be predicted by the relationship between the frequency selectivity of neurons and the onset spectra of the speech sounds. These findings indicate that the neurophysiology of primary auditory cortex is similar in monkeys and humans despite their vastly different experience with human speech, and that Heschl's gyrus is engaged in general auditory, and not language-specific, processing. 
This article is part of a Special Issue entitled "Communication Sounds and the Brain: New Directions and Perspectives".

Keywords: A1; AEP; BF; CSD; CV; ERBP; FRF; HG; Heschl's gyrus; MEG; MUA; POA; SRCSD; STG; VOT; averaged evoked potential; best frequency; consonant–vowel; current source density; event-related-band-power; frequency response function; magnetoencephalographic; multiunit activity; place of articulation; primary auditory cortex; summed rectified current source density; superior temporal gyrus; tBMF; temporal best modulation frequency; voice onset time.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation
  • Adult
  • Animals
  • Audiometry, Pure-Tone
  • Audiometry, Speech
  • Auditory Cortex / physiopathology*
  • Cues
  • Electrocorticography
  • Epilepsy / diagnosis
  • Epilepsy / physiopathology*
  • Epilepsy / psychology
  • Evoked Potentials, Auditory
  • Humans
  • Macaca fascicularis / physiology*
  • Male
  • Pattern Recognition, Physiological
  • Phonetics
  • Recognition, Psychology
  • Sound Spectrography
  • Species Specificity
  • Speech Acoustics*
  • Speech Perception*
  • Time Factors
  • Time Perception
  • Vocalization, Animal*
  • Voice Quality*