Inner speech is accompanied by a temporally-precise and content-specific corollary discharge
Introduction
As you read this text, you can probably hear your inner voice narrating the words. Inner speech – the silent production of words in one's mind (Alderson-Day and Fernyhough, 2015; Perrone-Bertolotti et al., 2014; Zivin, 1979) – is a core aspect of our mental lives; it is linked to a wide range of psychological functions, including reading, writing, planning, memory, self-motivation, and problem-solving (Alderson-Day et al., 2018; Morin et al., 2011, 2018; Sokolov et al., 1972). Despite its ubiquity, relatively little is known about the neural processes that underlie the production of inner speech. One influential hypothesis states that inner speech is a special form of overt speech (Feinberg, 1978; Frith, 1987; Jones and Fernyhough, 2007). Evidence for this comes from the observation that the brain regions involved in producing inner speech are similar to those involved in producing overt speech, including auditory, language, and supplementary motor areas (Aleman et al., 2005; McGuire et al., 1996; Palmer et al., 2001; Shergill et al., 2001; Shuster and Lemieux, 2005; Zatorre et al., 1996). According to the internal forward model of overt speech (Miall and Wolpert, 1996), when we move our articulator organs to speak, an efference copy is issued in parallel (Von Holst and Mittelstaedt, 1950). This efference copy forms the basis of a neural prediction – a corollary discharge (Sperry, 1950) – regarding the temporal and physical properties of our speech sounds, which is used to suppress the neural and perceptual responses to those sounds (Crapse and Sommer, 2008; Straka et al., 2018). If inner speech is, in fact, a special form of overt speech, then it should also be accompanied by a temporally-precise and content-specific corollary discharge. The present study investigated this issue.
There is a growing body of research suggesting that inner speech is accompanied by a corollary discharge (Ford and Mathalon, 2004; Scott, 2013; Tian and Poeppel, 2010, 2012, 2013, 2015; Tian et al., 2016, 2018; Whitford et al., 2017; Ylinen et al., 2015). Of particular relevance to the present study is an experiment conducted by Whitford et al. (2017), who introduced a procedure in which participants viewed a ticker-tape-style cue which provided them with precise knowledge about when they would hear an audible phoneme. In the listen condition of their experiment, participants were instructed to passively listen to the audible phoneme; in the inner speech condition, participants were instructed to produce an inner phoneme at the precise moment they heard the audible phoneme. On a random half of the trials in the inner speech condition, the inner and audible phonemes matched on content – this was called the match condition; on the other half of the trials, the inner and audible phonemes did not match on content – this was called the mismatch condition. Whitford et al. (2017) found that producing the inner phoneme attenuated the N1 component of the event-related potential (ERP) – an index of auditory cortex processing (Näätänen and Picton, 1987; Woods, 1995) – compared to passive listening, but only when the inner and audible phonemes matched on content. If the inner phoneme did not match the content of the audible phoneme, there was no attenuation of the N1. These results suggest that inner speech, similar to overt speech (Behroozmand et al., 2009; Behroozmand and Larson, 2011; Eliades and Wang, 2008; Heinks-Maldonado et al., 2005; Houde et al., 2002; Liu et al., 2011; Sitek et al., 2013), is accompanied by a content-specific corollary discharge, in that it contains information about the physical properties of inner speech.
However, when we move our articulator organs to speak, the accompanying corollary discharge is not only content-specific, but also temporally-precise, in that it contains information about the temporal properties of overt speech. Evidence for this comes from studies showing that N1-attenuation can be reduced or abolished by imposing a temporal delay between articulator movement and auditory feedback (Behroozmand et al., 2010, 2016; Chen et al., 2012; for non-speech examples, see Blakemore et al., 1998; Elijah et al., 2016; Oestreich et al., 2016; Whitford et al., 2011). In the present study, we investigated whether inner speech, like overt speech, is accompanied by a temporally-precise and content-specific corollary discharge. To accomplish this, we used the same ticker-tape-style cue introduced by Whitford et al. (2017) to control the time at which participants produced the inner phoneme, and we presented the audible phoneme 300 ms before, concurrently with, or 300 ms after participants produced the inner phoneme – we call these the before, precise, and after conditions, respectively. In Experiment 1, we compared the N1 elicited by the audible phoneme during passive listening and the production of inner speech across the different time delays; in Experiment 2, we compared the N1 elicited by an audible phoneme that either matched or mismatched the inner phoneme across the different time delays. Assuming that inner speech is accompanied by a temporally-precise and content-specific corollary discharge, we hypothesized larger N1-attenuation effects when the timing and content of the inner phoneme match those of the audible phoneme compared to when they do not.
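The timing manipulation and the attenuation measure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the closed-window convention, and the sign convention for attenuation are assumptions made for the example, not the study's analysis code. Only the three offsets (300 ms before, concurrent, 300 ms after) come from the text.

```python
from statistics import mean

# Offsets (ms) of the audible phoneme relative to the cued inner-phoneme time.
# These three values come from the text; all names below are hypothetical.
OFFSETS_MS = {"before": -300, "precise": 0, "after": 300}


def audible_onset_ms(inner_time_ms, condition):
    """Return the presentation time of the audible phoneme for a condition."""
    return inner_time_ms + OFFSETS_MS[condition]


def mean_amplitude(erp_uv, times_ms, window_ms):
    """Average ERP amplitude (in microvolts) over samples inside a closed window."""
    lo, hi = window_ms
    samples = [v for v, t in zip(erp_uv, times_ms) if lo <= t <= hi]
    return mean(samples)


def n1_attenuation(listen_uv, inner_uv):
    """Assumed convention: inner-speech amplitude minus passive-listening amplitude.

    The N1 is negative-going, so a positive result means a smaller
    (less negative) N1 during inner speech, i.e. attenuation.
    """
    return inner_uv - listen_uv
```

Under this convention, the hypothesis corresponds to a positive attenuation value in the precise condition that shrinks or disappears in the before and after conditions.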
Method
Participants. Forty-two students from UNSW Sydney participated in our study for course credit. All participants gave written informed consent prior to the experiment and reported having normal hearing in both ears. Data from three participants were excluded from the analyses due to excessive artefacts in the electroencephalogram (EEG) recording (>75% of epochs meeting the rejection criteria; see ERP processing and ERP analysis). Mean age of the remaining participants, 20 of whom were female and
Results
Behavioural results. Participants rated their subjective performance after each trial with a 5-point Likert scale, with scores ranging from 1, meaning “not at all successful”, to 5, meaning “completely successful”. Participants’ mean ratings were 4.12 (SD = 0.69) in the listen-before condition, 4.60 (SD = 0.46) in the listen-precise condition, 4.30 (SD = 0.56) in the listen-after condition, 3.52 (SD = 0.86) in the inner speech-before condition, 4.37 (SD = 0.74) in the inner speech-precise
Method
Participants. Sixty-one students participated in our study for course credit. Data from six participants were excluded from the analyses due to excessive artefacts in the EEG recording (see ERP processing and ERP analysis). Mean age of the remaining participants, 42 of whom were female and 52 of whom were right-handed, was 20 (SD = 3) years.
Apparatus, stimuli, and procedure. The apparatus, stimuli, and animation were identical to Experiment 1. The experiment consisted of 20 blocks of trials,
Results
Behavioural results. Participants’ mean ratings were 3.92 (SD = 0.80) in the match-before condition, 4.54 (SD = 0.43) in the match-precise condition, 4.38 (SD = 0.52) in the match-after condition, 3.44 (SD = 0.80) in the mismatch-before condition, 4.00 (SD = 0.76) in the mismatch-precise condition, and 4.03 (SD = 0.75) in the mismatch-after condition.
ERP results. Fig. 4a shows the ERPs, Fig. 4b shows the mean amplitudes for the N1 time-window, and Fig. 4c shows the voltage maps for the N1
Discussion
We set out to determine the properties of the corollary discharge associated with inner speech: specifically, whether it contains information about the temporal and physical properties of inner speech. In two experiments, participants produced an inner phoneme at a precisely-defined moment in time, and an audible phoneme was presented 300 ms before, concurrently with, or 300 ms after participants produced the inner phoneme. We found that producing the inner phoneme attenuated the N1, but only
Acknowledgements
This work was supported by the Australian Research Council (DP170103094) and the National Health and Medical Research Council of Australia (APP1090507).
References (68)
- et al. (2018). The varieties of inner speech questionnaire – revised (VISQ-R): replicating and refining links between inner speech and psychopathology. Conscious. Cognit.
- et al. (2008). Suppression of the auditory N1 event-related potential component with unpredictable self-initiated tones: evidence for internal forward models with dynamic stimulation. Int. J. Psychophysiol.
- et al. (2009). Vocalization-induced enhancement of the auditory cortex responsiveness during voice F0 feedback perturbation. Clin. Neurophysiol.
- et al. (2016). A temporal predictive code for voice motor control: evidence from ERP and behavioral responses to pitch-shifted auditory feedback. Brain Res.
- et al. (2008). Corollary discharge circuits in the primate brain. Curr. Opin. Neurobiol.
- et al. (2004). A review of the evidence for P2 being an independent component process: age, sleep and modality. Clin. Neurophysiol.
- et al. (2016). Modifying temporal expectations: changing cortical responsivity to delayed self-initiated sensations with training. Biol. Psychol.
- et al. (2004). Electrophysiological evidence of corollary discharge dysfunction in schizophrenia during talking and thinking. J. Psychiatr. Res.
- et al. (1983). A new method for off-line removal of ocular artifact. Electroencephalogr. Clin. Neurophysiol.
- et al. (2007). Thought as action: inner speech, self-monitoring, and auditory verbal hallucinations. Conscious. Cognit.