Trends in Cognitive Sciences
Volume 18, Issue 9, September 2014, Pages 472-479

Review
Dynamic speech representations in the human temporal lobe

https://doi.org/10.1016/j.tics.2014.05.001

Highlights

  • Recent methodological advances reveal underlying information representations.

  • Spectrotemporal regions such as the STG show strong context-dependent responses to speech.

  • Contextual modulation occurs both in situ and through interactive connectivity between regions.

  • Context-dependent representations may give rise to abstract representations of words.

  • Multivariate and machine learning statistics will help uncover how acoustic representations transform into words.

Speech perception requires rapid integration of acoustic input with context-dependent knowledge. Recent methodological advances have allowed researchers to identify underlying information representations in primary and secondary auditory cortex and to examine how context modulates these representations. We review recent studies that focus on contextual modulations of neural activity in the superior temporal gyrus (STG), a major hub for spectrotemporal encoding. Recent findings suggest a highly interactive flow of information processing through the auditory ventral stream, including influences of higher-level linguistic and metalinguistic knowledge, even within individual areas. Such mechanisms may give rise to more abstract representations, such as those for words. We discuss the importance of characterizing representations of context-dependent and dynamic patterns of neural activity in the approach to speech perception research.

Introduction

How does the human brain generate phenomenologically rich representations of words from the complex and noisy acoustic speech signal? This is not a new question: many of our current theories and observations are heavily influenced by those from nearly 140 years ago [1,2]. In this review, we consider the implications of progress that has been made in redefining some of the issues central to speech perception. Recent advances have allowed researchers to examine the functioning human brain with an unprecedented level of detail, paying particular attention to decoding the representations contained in speech-evoked neural responses [3–5], an important step beyond localizing task-dependent activity. Combined with a growing and productive interaction between linguistics and neuroscience [6], new recording and analysis methods have created a pivotal moment for understanding the neural basis of speech perception.

Section snippets

Organization of the ventral stream

Human neuroimaging and neurophysiology studies support the concept of an information processing hierarchy for speech perception in the temporal lobe. Responses evoked by speech sounds, words, and sentences show activity that spreads primarily from posterior to anterior temporal areas [7–14]. This dominant direction of information flow is facilitated by anatomical connections between the superior temporal plane and anteroventral temporal areas [15] and is commonly referred

Early cortical auditory encoding

To examine the specific roles that the STG plays in the speech perception hierarchy, it is important to understand the inputs to this region. A large body of work has established important aspects of sensory processing that occur in the ascending auditory system en route to the primary auditory cortex in several mammalian species [25–27]. A1 in humans, located on the posteromedial portion of Heschl's gyrus, is characterized by at least one major tonotopic axis [28]. An important aspect of

Stimulus and early linguistic representations in the STG

Although A1 shows stimulus- and context-dependent modulations in neural activity, few would argue that it exhibits responses that are specific to speech. By contrast, a major target of primary auditory outputs is the STG, which is one of the best-characterized regions in the speech perception system and which shows responses that suggest the earliest stages of speech-tuned representation. Like its upstream neighbors, the STG is highly sensitive to the spectrotemporal content of the acoustic

Cognitive and linguistic modulation

The studies described thus far provide compelling evidence that the STG is a major hub for sublexical processing in the speech perception hierarchy. As in many other brain regions, responses in the STG are nonlinear not only along physical stimulus dimensions (such as categorical phoneme perception), but also according to complex cognitive contexts and task demands. For example, several recent studies have demonstrated that STG activity is powerfully modulated by the attentional constraints of

Lexical representations in the ventral stream

The studies reviewed above suggest that activity in the temporal lobe during speech perception is nondeterministic. That is, it is impossible to predict activity at a given site with a high degree of precision simply based on the physical characteristics of the stimulus. This principle is a defining feature of abstract representations and historically has made it difficult to study the underlying representations of neural systems beyond early sensory cortices. It also makes it potentially even

Concluding remarks

We have discussed evidence that representations of speech information cannot be understood in a strictly linear or deterministic hierarchical framework, even for spectrotemporal representations in the STG. This presents a challenge for understanding more complex and abstract forms of representation such as words (Box 1), but it also potentially provides a means for major advances in neurolinguistics that parallel those in sensory neuroscience. We believe that machine learning and dynamical

Acknowledgments

M.K.L. was funded by National Institutes of Health (NIH) National Research Service Award F32-DC013486. E.F.C. was funded by NIH grants R00-NS065120, DP2-OD00862, and R01-DC012379 and the Esther A. and Joseph Klingenstein Foundation.

References (99)

  • T. Zaehle

    Segmental processing in the human auditory dorsal stream

    Brain Res.

    (2008)
  • P.E. Turkeltaub et al.

    Localization of sublexical speech perception components

    Brain Lang.

    (2010)
  • H. Takeichi

    Comprehension of degraded speech sounds with m-sequence modulation: an fMRI study

    Neuroimage

    (2010)
  • D. Poeppel

    The analysis of speech in different temporal integration windows: cerebral lateralization as “asymmetric sampling in time”

    Speech Commun.

    (2003)
  • E.M. Zion Golumbic

    Mechanisms underlying selective neuronal tracking of attended speech at a “cocktail party”

    Neuron

    (2013)
  • M. Sabri

    Attentional and linguistic interactions in speech perception

    Neuroimage

    (2008)
  • C.J. Wild

    Human auditory cortex is sensitive to the perceived clarity of speech

    Neuroimage

    (2012)
  • J.L. McClelland et al.

    The TRACE model of speech perception

    Cogn. Psychol.

    (1986)
  • J.M. McQueen

    Are there really interactive processes in speech perception?

    Trends Cogn. Sci.

    (2006)
  • J.L. McClelland

    Are there interactive processes in speech perception?

    Trends Cogn. Sci.

    (2006)
  • J.L. Elman

    An alternative view of the mental lexicon

    Trends Cogn. Sci.

    (2004)
  • S.C. Creel

    Heeding the voice of experience: the role of talker variation in lexical access

    Cognition

    (2008)
  • T. Kraljic et al.

    Perceptual learning for speech: is there a return to normal?

    Cogn. Psychol.

    (2005)
  • W.D. Marslen-Wilson

    Functional parallelism in spoken word-recognition

    Cognition

    (1987)
  • P. Gagnepain

    Temporal predictive codes for spoken words in auditory cortex

    Curr. Biol.

    (2012)
  • R. Prabhakaran

    An event-related fMRI investigation of phonological–lexical competition

    Neuropsychologia

    (2006)
  • D. Dahan

    Time course of frequency effects in spoken-word recognition: evidence from eye movements

    Cogn. Psychol.

    (2001)
  • S. Lindsay

    Acquiring novel words and their past tenses: evidence from lexical effects on phonetic categorisation

    J. Mem. Lang.

    (2012)
  • A. Yaron

    Sensitivity to complex statistical regularities in rat auditory cortex

    Neuron

    (2012)
  • C. Wernicke

    Der Aphasische Symptomencomplex: Eine Psychologische Studie auf Anatomischer Basis

    (1874)
  • N. Geschwind

    Disconnexion Syndromes in Animals and Man

    (1974)
  • E. Formisano

    “Who” is saying “what”? Brain-based decoding of human voice and speech

    Science

    (2008)
  • B.N. Pasley

    Reconstructing speech from human auditory cortex

    PLoS Biol.

    (2012)
  • K.E. Bouchard

    Functional organization of human sensorimotor cortex for speech articulation

    Nature

    (2013)
  • D. Poeppel

    Speech perception at the interface of neurobiology and linguistics

    Philos. Trans. R. Soc. Lond. B: Biol. Sci.

    (2008)
  • Y. Lerner

    Topographic mapping of a hierarchy of temporal receptive windows using a narrated story

    J. Neurosci.

    (2011)
  • I. DeWitt et al.

    Phoneme and word recognition in the auditory ventral stream

    Proc. Natl. Acad. Sci. U.S.A.

    (2012)
  • A.P. Leff

    The cortical dynamics of intelligible speech

    J. Neurosci.

    (2008)
  • K. Okada

    Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech

    Cereb. Cortex

    (2010)
  • C.J. Price

    The anatomy of language: a review of 100 fMRI studies published in 2009

    Ann. N. Y. Acad. Sci.

    (2010)
  • B.H. Scott

    Transformation of temporal processing across auditory cortex of awake macaques

    J. Neurophysiol.

    (2011)
  • M. Steinschneider

    Phonemic representations and categories

    Neural Correlates of Auditory Cognition

    (2013)
  • M. Chevillet

    Functional correlates of the anterolateral processing hierarchy in human auditory cortex

    J. Neurosci.

    (2011)
  • G. Hickok et al.

    The cortical organization of speech processing

    Nat. Rev. Neurosci.

    (2007)
  • J.P. Rauschecker et al.

    Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing

    Nat. Neurosci.

    (2009)
  • D. Saur

    Ventral and dorsal pathways for language

    Proc. Natl. Acad. Sci. U.S.A.

    (2008)
  • M.H. Davis et al.

    A complementary systems account of word learning: neural and behavioural evidence

    Philos. Trans. R. Soc. Lond. B: Biol. Sci.

    (2009)
  • M. Bozic

    Bihemispheric foundations for human speech comprehension

    Proc. Natl. Acad. Sci. U.S.A.

    (2010)
  • S. Evans

    The pathways for intelligible speech: multivariate and univariate perspectives

    Cereb. Cortex

    (2013)