Elsevier

NeuroImage

Volume 198, September 2019, Pages 170-180

Inner speech is accompanied by a temporally-precise and content-specific corollary discharge

https://doi.org/10.1016/j.neuroimage.2019.04.038

Abstract

When we move our articulator organs to produce overt speech, the brain generates a corollary discharge that acts to suppress the neural and perceptual responses to our speech sounds. Recent research suggests that inner speech – the silent production of words in one's mind – is also accompanied by a corollary discharge. Here, we show that this corollary discharge contains information about the temporal and physical properties of inner speech. In two experiments, participants produced an inner phoneme at a precisely-defined moment in time. An audible phoneme was presented 300 ms before, concurrently with, or 300 ms after participants produced the inner phoneme. We found that producing the inner phoneme attenuated the N1 component of the event-related potential – an index of auditory cortex processing – but only when the inner and audible phonemes occurred concurrently and matched on content. If the audible phoneme was presented before or after the production of the inner phoneme, or if the inner phoneme did not match the content of the audible phoneme, there was no attenuation of the N1. These results suggest that inner speech is accompanied by a temporally-precise and content-specific corollary discharge. We conclude that these results support the notion of a functional equivalence between the neural processes that underlie the production of inner and overt speech, and may provide a platform for identifying inner speech abnormalities in disorders with which such abnormalities have been putatively associated, such as schizophrenia.

Introduction

As you read this text, you can probably hear your inner voice narrating the words. Inner speech – the silent production of words in one's mind (Alderson-Day and Fernyhough, 2015; Perrone-Bertolotti et al., 2014; Zivin, 1979) – is a core aspect of our mental lives; it is linked to a wide range of psychological functions, including reading, writing, planning, memory, self-motivation, and problem-solving (Alderson-Day et al., 2018; Morin et al., 2011, 2018; Sokolov et al., 1972). Despite its ubiquity, relatively little is known about the neural processes that underlie the production of inner speech. One influential hypothesis states that inner speech is a special form of overt speech (Feinberg, 1978; Frith, 1987; Jones and Fernyhough, 2007). Evidence for this comes from the observation that the brain regions involved in producing inner speech are similar to those involved in producing overt speech, including auditory, language, and supplementary motor areas (Aleman et al., 2005; McGuire et al., 1996; Palmer et al., 2001; Shergill et al., 2001; Shuster and Lemieux, 2005; Zatorre et al., 1996). According to the internal forward model of overt speech (Miall and Wolpert, 1996), when we move our articulator organs to speak, an efference copy is issued in parallel (Von Holst and Mittelstaedt, 1950). This efference copy forms the basis of a neural prediction – a corollary discharge (Sperry, 1950) – regarding the temporal and physical properties of our speech sounds, which is used to suppress the neural and perceptual responses to those sounds (Crapse and Sommer, 2008; Straka et al., 2018). If inner speech is, in fact, a special form of overt speech, then it should also be accompanied by a temporally-precise and content-specific corollary discharge. The present study investigated this issue.

There is a growing body of research suggesting that inner speech is accompanied by a corollary discharge (Ford and Mathalon, 2004; Scott, 2013; Tian and Poeppel, 2010, 2012, 2013, 2015; Tian et al., 2016, 2018; Whitford et al., 2017; Ylinen et al., 2015). Of particular relevance to the present study is an experiment conducted by Whitford et al. (2017), who introduced a procedure in which participants viewed a ticker-tape-style cue that provided them with precise knowledge about when they would hear an audible phoneme. In the listen condition of their experiment, participants were instructed to passively listen to the audible phoneme; in the inner speech condition, participants were instructed to produce an inner phoneme at the precise moment they heard the audible phoneme. On a random half of the trials in the inner speech condition, the inner and audible phonemes matched on content – this was called the match condition; on the other half of the trials, the inner and audible phonemes did not match on content – this was called the mismatch condition. Whitford et al. (2017) found that producing the inner phoneme attenuated the N1 component of the event-related potential (ERP) – an index of auditory cortex processing (Näätänen and Picton, 1987; Woods, 1995) – compared to passive listening, but only when the inner and audible phonemes matched on content. If the inner phoneme did not match the content of the audible phoneme, there was no attenuation of the N1. These results suggest that inner speech, similar to overt speech (Behroozmand et al., 2009; Behroozmand and Larson, 2011; Eliades and Wang, 2008; Heinks-Maldonado et al., 2005; Houde et al., 2002; Liu et al., 2011; Sitek et al., 2013), is accompanied by a content-specific corollary discharge, in that it contains information about the physical properties of inner speech.
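As a rough illustration of how N1-attenuation of this kind could be quantified, the sketch below takes the mean ERP amplitude inside a time window for two conditions and reports the difference. The 80-120 ms window, the function names, and the toy voltages are assumptions made for illustration only; they are not taken from Whitford et al. (2017) or from the present study.

```python
def mean_amplitude(erp_uv, times_ms, window_ms):
    """Mean voltage (in microvolts) of the samples whose time falls inside window_ms."""
    lo, hi = window_ms
    samples = [v for v, t in zip(erp_uv, times_ms) if lo <= t <= hi]
    if not samples:
        raise ValueError("no samples fall inside the requested window")
    return sum(samples) / len(samples)


def n1_attenuation(listen_erp_uv, inner_erp_uv, times_ms, window_ms=(80, 120)):
    """Inner-speech amplitude minus listen amplitude in the window.

    The N1 is a negative-going component, so a positive result means the
    inner-speech N1 is smaller (less negative) than the listen N1.
    The default window is a hypothetical placeholder.
    """
    listen = mean_amplitude(listen_erp_uv, times_ms, window_ms)
    inner = mean_amplitude(inner_erp_uv, times_ms, window_ms)
    return inner - listen


# Toy data (hypothetical): four samples per condition.
times_ms = [0, 50, 100, 150]
listen_erp_uv = [0.0, -1.0, -4.0, -1.0]
inner_erp_uv = [0.0, -1.0, -2.0, -1.0]
attenuation = n1_attenuation(listen_erp_uv, inner_erp_uv, times_ms)
```

With these toy values only the 100 ms sample falls inside the window, so the difference is simply the gap between the two conditions at that sample.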

However, when we move our articulator organs to speak, the accompanying corollary discharge is not only content-specific, but also temporally-precise, in that it contains information about the temporal properties of overt speech. Evidence for this comes from studies showing that N1-attenuation can be reduced or abolished by imposing a temporal delay between articulator movement and auditory feedback (Behroozmand et al., 2010, 2016; Chen et al., 2012; for non-speech examples, see Blakemore et al., 1998; Elijah et al., 2016; Oestreich et al., 2016; Whitford et al., 2011). In the present study, we investigated whether inner speech, like overt speech, is accompanied by a temporally-precise and content-specific corollary discharge. To accomplish this, we used the same ticker-tape-style cue introduced by Whitford et al. (2017) to control the time at which participants produced the inner phoneme, and we presented the audible phoneme 300 ms before, concurrently with, or 300 ms after participants produced the inner phoneme – we call these the before, precise, and after conditions, respectively. In Experiment 1, we compared the N1 elicited by the audible phoneme during passive listening and the production of inner speech across the different time delays; in Experiment 2, we compared the N1 elicited by an audible phoneme that either matched or mismatched the inner phoneme across the different time delays. Assuming that inner speech is accompanied by a temporally-precise and content-specific corollary discharge, we hypothesize larger N1-attenuation effects when the timing and content of the inner phoneme match those of the audible phoneme compared to when they do not.
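The design described above can be laid out as a small condition table. The following is a minimal sketch, assuming the three timing offsets (-300, 0, +300 ms relative to the cued moment) and the two task levels per experiment named in the text; the class name, function names, and the example cue time of 1000 ms are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

# Onset of the audible phoneme relative to the cued moment, in ms (from the text).
TIMING_OFFSETS_MS = {"before": -300, "precise": 0, "after": 300}

# Task levels compared within each experiment (from the text).
TASKS_BY_EXPERIMENT = {1: ("listen", "inner speech"), 2: ("match", "mismatch")}


@dataclass(frozen=True)
class Condition:
    experiment: int
    task: str
    timing: str
    audible_offset_ms: int


def build_conditions():
    """Cross every task level with every timing level, per experiment."""
    return [
        Condition(experiment, task, timing, TIMING_OFFSETS_MS[timing])
        for experiment, tasks in TASKS_BY_EXPERIMENT.items()
        for task, timing in product(tasks, TIMING_OFFSETS_MS)
    ]


def audible_onset_ms(condition, cued_moment_ms):
    """Absolute onset of the audible phoneme, given the cued moment in a trial."""
    return cued_moment_ms + condition.audible_offset_ms


conditions = build_conditions()
# Hypothetical trial in which the cue marks 1000 ms after trial start.
first_onset = audible_onset_ms(conditions[0], 1000)
```

Each experiment therefore has six cells (two task levels by three timing levels), which matches the six condition labels reported in the behavioural results, such as listen-before and mismatch-after.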

Experiment 1: Method

Participants. Forty-two students from UNSW Sydney participated in our study for course credit. All participants gave written informed consent prior to the experiment and reported having normal hearing in both ears. Data from three participants were excluded from the analyses due to excessive artefacts in the electroencephalogram (EEG) recording (>75% of epochs meeting the rejection criteria; see ERP processing and ERP analysis). Mean age of the remaining participants, 20 of whom were female and

Experiment 1: Results

Behavioural results. Participants rated their subjective performance after each trial with a 5-point Likert scale, with scores ranging from 1, meaning “not at all successful”, to 5, meaning “completely successful”. Participants’ mean ratings were 4.12 (SD = 0.69) in the listen-before condition, 4.60 (SD = 0.46) in the listen-precise condition, 4.30 (SD = 0.56) in the listen-after condition, 3.52 (SD = 0.86) in the inner speech-before condition, 4.37 (SD = 0.74) in the inner speech-precise

Experiment 2: Method

Participants. Sixty-one students participated in our study for course credit. Data from six participants were excluded from the analyses due to excessive artefacts in the EEG recording (see ERP processing and ERP analysis). Mean age of the remaining participants, 42 of whom were female and 52 of whom were right-handed, was 20 (SD = 3) years.

Apparatus, stimuli, and procedure. The apparatus, stimuli, and animation were identical to Experiment 1. The experiment consisted of 20 blocks of trials,

Experiment 2: Results

Behavioural results. Participants’ mean ratings were 3.92 (SD = 0.80) in the match-before condition, 4.54 (SD = 0.43) in the match-precise condition, 4.38 (SD = 0.52) in the match-after condition, 3.44 (SD = 0.80) in the mismatch-before condition, 4.00 (SD = 0.76) in the mismatch-precise condition, and 4.03 (SD = 0.75) in the mismatch-after condition.

ERP results. Fig. 4a shows the ERPs, Fig. 4b shows the mean amplitudes for the N1 time-window, and Fig. 4c shows the voltage maps for the N1

Discussion

We set out to determine the properties of the corollary discharge associated with inner speech: specifically, whether it contains information about the temporal and physical properties of inner speech. In two experiments, participants produced an inner phoneme at a precisely-defined moment in time, and an audible phoneme was presented 300 ms before, concurrently with, or 300 ms after participants produced the inner phoneme. We found that producing the inner phoneme attenuated the N1, but only

Acknowledgements

This work was supported by the Australian Research Council (DP170103094) and the National Health and Medical Research Council of Australia (APP1090507).

References (68)

  • F. Knolle et al. (2013). Prediction errors in self- and externally-generated deviants. Biol. Psychol.
  • M.A. Lebedev et al. (2006). Brain-machine interfaces: past, present and future. Trends Neurosci.
  • H. Liu et al. (2011). Differential effects of perturbation direction and magnitude on the neural processing of voice pitch feedback. Clin. Neurophysiol.
  • R.C. Miall et al. (1996). Forward models for physiological motor control. Neural Networks
  • A. Morin et al. (2011). Self-reported frequency, content, and functions of inner speech. Social and Behavioural Sciences
  • E.D. Palmer et al. (2001). An event-related fMRI study of overt and covert word stem completion. Neuroimage
  • M. Perrone-Bertolotti et al. (2014). What is that little voice inside my head? Inner speech phenomenology, its role in cognitive performance, and its relation to self-monitoring. Behav. Brain Res.
  • J. Polich (2007). Updating P300: an integrative theory of P3a and P3b. Clin. Neurophysiol.
  • L.I. Shuster et al. (2005). An fMRI investigation of covertly and overtly produced mono- and multisyllabic words. Brain Lang.
  • X. Tian et al. (2016). Mental imagery of speech implicates two mechanisms of perceptual reactivation. Cortex
  • J. Virtanen et al. (1998). Replicability of MEG and EEG measures of the auditory N1/N1m-response. Electroencephalogr. Clin. Neurophysiol.
  • B. Alderson-Day et al. (2015). Inner speech: development, cognitive functions, phenomenology, and neurobiology. Psychol. Bull.
  • A. Aleman et al. (2005). The functional neuroanatomy of metrical stress evaluation of perceived and imagined spoken words. Cerebr. Cortex
  • S.O. Aliu et al. (2009). Motor-induced suppression of the auditory cortex. J. Cogn. Neurosci.
  • R. Behroozmand et al. (2011). Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback. BMC Neurosci.
  • R. Behroozmand et al. (2010). Time-dependent neural processing of the auditory feedback during voice pitch error detection. J. Cogn. Neurosci.
  • S.J. Blakemore et al. (1998). Central cancellation of self-produced tickle sensation. Nat. Neurosci.
  • D.H. Brainard (1997). The Psychophysics toolbox. Spatial Vis.
  • Z. Chen et al. (2012). Effect of temporal predictability on the neural processing of self-triggered auditory stimulation during vocalization. BMC Neurosci.
  • S.J. Eliades et al. (2008). Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature
  • I. Feinberg (1978). Efference copy and corollary discharge: implications for thinking and its disorders. Schizophr. Bull.
  • P.C. Fletcher et al. (2009). Perceiving is believing: a Bayesian approach to explaining the positive symptoms of schizophrenia. Nat. Rev. Neurosci.
  • C.D. Frith (1987). The positive and negative symptoms of schizophrenia reflect impairments in the perception and initiation of action. Psychol. Med.
  • T.H. Heinks-Maldonado et al. (2005). Fine-tuning of auditory cortex during speech production. Psychophysiology