Abstract
Can monkeys learn simple auditory sequences and detect when a new sequence deviates from the stored pattern? Here we tested the predictive-coding hypothesis, which postulates that cortical areas encode internal models of sensory sequences at multiple hierarchical levels, and use these predictive models to detect deviant stimuli. In humans, hierarchical predictive coding has been supported by studies of auditory sequence processing, but it is unclear whether internal hierarchical models of auditory sequences are also available to nonhuman animals. Using fMRI, we evaluated the encoding of auditory regularities in awake monkeys listening to first- and second-order sequence violations. We observed distinct fMRI responses to first-order violations in auditory cortex and to second-order violations in a frontoparietal network, a distinction only demonstrated in conscious humans so far. The results indicate that the capacity to represent and predict the structure of auditory sequences is shared by humans and nonhuman primates.
Introduction
In humans, many brain areas show a reduced activation to predictable sequences and an increased response to unexpected stimuli, thought to reflect a prediction error (Friston, 2005; Meyer and Olson, 2011; Wacongne et al., 2011, 2012; Sanmiguel et al., 2013). But what makes a sequence predictable? Regular sequences of sounds may vary in complexity from simple repetition to higher-order rules. Understanding whether and how auditory sequences are learned, and whether nonhuman primates differ from humans in that respect, is of high interest in cognitive neuroscience. Bayesian predictive-coding theory (Friston, 2005) and experimental evidence (Bekinschtein et al., 2009; Schofield et al., 2009; Wacongne et al., 2011; Cornella et al., 2012) suggest that prediction and deviance detection may be hierarchically organized in the human cortex. However, the existence of several hierarchical levels of auditory prediction in other animals has never been demonstrated.
The capacity of the brain to learn a rule can be evaluated by measuring its response to an unexpected violation of the rule, called a deviant. Human auditory event-related potentials are usually used to detect this violation. Predictive coding models postulate that the human mismatch response does not merely reflect a passive stimulus-specific adaptation (SSA) process, but an active process of predicting the next sound (Garrido et al., 2009). Furthermore, there is evidence that deviance detection is hierarchically organized within the auditory system (Cornella et al., 2012). The mismatch negativity (MMN) reflects a preattentive novelty detection (Näätänen et al., 2001), generated by an automatic change-detection mechanism (Winkler et al., 2009; Bendixen et al., 2012; Wacongne et al., 2012). Attention modulates MMN amplitude (Szymanski et al., 1999), but MMN can persist during inattention, sleep (Atienza et al., 1997), or coma (Fischer et al., 1999). By contrast, the P3b response to auditory novelty is dependent on attention and conscious awareness of the stimulus (Sergent et al., 2005).
The “local-global” paradigm (Bekinschtein et al., 2009) that we adopt here probes auditory sequence processing at two hierarchical levels of deviancy, which are detected by distinct brain systems. Local deviants systematically elicit a mismatch response in auditory cortex (Näätänen et al., 2001; Winkler et al., 2009), whereas global deviants, in humans, evoke a P3b response, associated in fMRI with a broad prefrontoparietal network. The local effect resists inattention, sleep, and coma, but the global effect is only observed in conscious and attentive subjects (Bekinschtein et al., 2009). Only first-level (local) mismatch responses have been observed in nonhuman animals, for instance, monkeys (Javitt et al., 1994; Gil-da-Costa et al., 2013) and rodents (Taaseh et al., 2011). Is the monkey brain capable of simultaneously responding to both types of violations?
Materials and Methods
Animals.
Three rhesus macaques (1 male, 2 females, 4–6 kg, 8–9 years of age) were tested. All procedures were conducted in accordance with the European convention for animal care (86–406), the National Institutes of Health's Guide for the Care and Use of Laboratory Animals, and were approved by the institutional Ethical Committee (CETEA 10-003).
“Local-global” auditory paradigm.
We adapted the “local-global” auditory paradigm described previously (Bekinschtein et al., 2009) (Fig. 1). This paradigm is based on local (within trials) and global (across trials) violations of temporal regularities. At the local level (low/first hierarchical order), a deviant sound can be introduced after a series of 4 identical sounds (e.g., xxxxY, where x is the repeated sound and Y the deviant sound). At the global level (high/second hierarchical order), a sequence of sounds, called the “global standard,” is repeatedly presented for a block of trials (e.g., xxxxY), and then this regularity is violated by rare sequences called “global deviants” (e.g., xxxxx). Each trial is made of five consecutive sounds (either a high pitch 1600 Hz, or a low pitch 800 Hz, 70 dB, 50 ms duration, 150 ms stimulus onset asynchrony between sounds, total duration of 650 ms). The series of sounds are separated by an 850 ms interstimulus interval, for a total trial duration of 1500 ms. To create a global regular structure, during a given fMRI run, one of the four series of sounds was selected as the “global standard” (i.e., the regular sequence presented in most trials, with only rare “global deviants” where the fifth sound differed) (Fig. 1). Four fMRI runs, presented in random order, tested each of the four possible regular sequences in turn. Over these 4 runs, our experiment followed a 2 × 2 design with orthogonal factors of local regularity (on a given trial, the fifth sound could be different from, or identical to previous sounds) and global regularity (one of the series of sounds was more frequent than the other). The trials were presented in the following run structure: a first period of rest (14.4 s, 6 TRs) followed by 5 series of 24 trials (36 s, 15 TRs), each followed by a period of rest (14.4 s, 6 TRs).
Each 24-trial series comprises an initial series of 4 habituation trials (100% global standards), followed by a mixed series of 20 post-habituation trials with 4 rare global deviants (20%), and 16 frequent global standards (80%). Deviant trials were always followed by at least 2 consecutive standard trials. The total duration was 266.4 s or 111 TRs. To ensure stimulus novelty during fMRI acquisitions, monkeys listened to the “local-global” paradigm only during the scanning session and not during the training sessions. The global standard changed every 5 min, thus evaluating the fast acquisition of the relevant sequence.
Figure 1. The “local-global” paradigm. a, Local level (first-order): short auditory sequences comprised either 5 identical tones (local standard, denoted as xxxxx) or 4 identical tones followed by a distinct one (local deviant, denoted as xxxxY). b, Global level (second-order): sequences were presented in fMRI runs where one sequence served as the global standard and another as the global deviant.
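The timing figures and the 24-trial series structure described above can be checked with a short sketch. This is an illustration only, not the stimulus code used in the study: the function names, the rejection-sampling strategy, and the reading that each deviant must be followed by two standards within the same series are all assumptions.

```python
import random

# Timing values quoted in the text, in milliseconds.
TONE_MS, SOA_MS, TONES_PER_TRIAL, GAP_MS, TR_MS = 50, 150, 5, 850, 2400

sequence_ms = (TONES_PER_TRIAL - 1) * SOA_MS + TONE_MS  # five-tone sequence
trial_ms = sequence_ms + GAP_MS                         # sequence plus silent gap
series_trs = 24 * trial_ms // TR_MS                     # TRs per 24-trial series
run_trs = 6 + 5 * (series_trs + 6)                      # rest, then 5 x (series + rest)
run_ms = run_trs * TR_MS


def global_deviant_of(standard):
    """Swap the fifth tone: xxxxx <-> xxxxY."""
    return "xxxxY" if standard == "xxxxx" else "xxxxx"


def make_series(standard, n_hab=4, n_test=20, n_dev=4, min_gap=2, rng=None):
    """Build one series: habituation trials, then standards with rare deviants.

    Hypothetical helper. Shuffles until every deviant is followed by at
    least `min_gap` standards inside the series (an assumed constraint).
    """
    rng = rng or random.Random()
    while True:
        flags = [True] * n_dev + [False] * (n_test - n_dev)
        rng.shuffle(flags)
        if all(
            i + min_gap < n_test and not any(flags[i + 1:i + 1 + min_gap])
            for i, is_dev in enumerate(flags) if is_dev
        ):
            break
    deviant = global_deviant_of(standard)
    return [standard] * n_hab + [deviant if f else standard for f in flags]


if __name__ == "__main__":
    print(sequence_ms, trial_ms, series_trs, run_trs, run_ms / 1000)
    print(make_series("xxxxY", rng=random.Random(0)))
```

Working in integer milliseconds avoids floating-point rounding when converting between trial counts and TRs.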
fMRI data acquisition.
After the injection of MION contrast agent (10 mg/kg, i.v.), monkeys sat in a sphinx position in a chair inside a 3T scanner (Siemens; Tim Trio) (Vanduffel et al., 2001). Whole-brain functional data were acquired using a T2* EPI sequence (TR = 2400 ms, TE = 20 ms, and 1.5 mm³ voxel size).
fMRI analyses.
Functional images were coregistered to the monkey MNI anatomical template (Frey et al., 2011). In total, 148 runs (16,428 volumes, 37 sessions of 4 basic fMRI runs) were analyzed. Whole-brain data were visualized using Caret (version 5.61).
Individual analyses.
The activation time series was modeled, within each fMRI run, by event-related regressors obtained by convolution of the experimental conditions (habituation, global standards, and global deviants) with the canonical hemodynamic response function for MION, and its time derivative. We excluded global standard trials that immediately followed a global deviant trial (Bekinschtein et al., 2009). These trials were modeled by a distinct regressor and its derivative but were not analyzed further.
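As a rough illustration of how such event-related regressors are built, the sketch below convolves a boxcar of trial onsets with a response kernel and also returns its time derivative, both sampled once per TR. The kernel `toy_mion_kernel` is a placeholder with an arbitrary time constant and a negative sign (MION signal drops as blood volume rises); it is not the canonical MION function used in the analysis, and all names here are hypothetical.

```python
import math

TR_S = 2.4     # repetition time in seconds, from the acquisition section
N_VOLS = 111   # volumes per run


def toy_mion_kernel(t, tau=8.0):
    """Placeholder response kernel (assumed shape, NOT the study's function)."""
    return -(t / tau) * math.exp(1.0 - t / tau)


def make_regressors(onsets_s, trial_dur_s=1.5, dt=0.1, kernel_len_s=60.0):
    """Return (regressor, derivative), each with one value per volume."""
    n_fine = round(N_VOLS * TR_S / dt)
    # Boxcar on a fine time grid: 1 while a trial of this condition is playing.
    box = [0.0] * n_fine
    for onset in onsets_s:
        start = round(onset / dt)
        stop = min(round((onset + trial_dur_s) / dt), n_fine)
        for i in range(start, stop):
            box[i] = 1.0
    kernel = [toy_mion_kernel(k * dt) for k in range(round(kernel_len_s / dt))]
    # Discrete convolution, truncated to the run length.
    conv = [0.0] * n_fine
    for i, b in enumerate(box):
        if b == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= n_fine:
                break
            conv[i + k] += b * h * dt
    # Central-difference time derivative of the convolved regressor.
    deriv = [0.0] * n_fine
    for i in range(1, n_fine - 1):
        deriv[i] = (conv[i + 1] - conv[i - 1]) / (2 * dt)
    # Sample both at the start of each volume.
    idx = [round(v * TR_S / dt) for v in range(N_VOLS)]
    return [conv[i] for i in idx], [deriv[i] for i in idx]


if __name__ == "__main__":
    reg, dreg = make_regressors([14.4, 15.9, 17.4])
    print(len(reg), min(reg))
```

One regressor pair of this kind would be built per condition (habituation, global standards, global deviants, and the excluded post-deviant standards).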
Group analyses.
For each monkey and each fMRI session (i.e., a group of four fMRI runs), the above first-level SPM model yielded a β-weight image of activation for each condition relative to rest (expressed as a percentage of the whole-brain signal). All of these images were then entered into several second-level whole-brain ANOVAs. The contrasts defined were as follows: activation to all sounds (habituation, frequent and rare sequences) relative to rest, frequent sounds relative to rest, rare sounds relative to rest, local effect (local deviant − local standard sequences), and global effect (rare − frequent sequences). We used a threshold of p < 0.001 uncorrected at the voxel level and report only regions where such voxels grouped together to form a contiguous cluster whose extent was significant at p < 0.05, corrected for multiple comparisons across the brain volume (false discovery rate [FDR]).
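The local and global contrasts follow from the 2 × 2 design: every trial type is either a local standard or a local deviant, and either frequent or rare. The sketch below spells out one way to weight the four cells, collapsing over tone pitch. The condition labels, the ±0.5 normalization, and the example β values are illustrative assumptions, not the contrast vectors or data from the SPM analysis.

```python
# Each cell of the 2 x 2 design: is the fifth tone a local deviant,
# and is the sequence the rare (global deviant) one in its run?
CONDITIONS = {
    "xxxxx_frequent": {"local": False, "global": False},
    "xxxxY_rare":     {"local": True,  "global": True},
    "xxxxY_frequent": {"local": True,  "global": False},
    "xxxxx_rare":     {"local": False, "global": True},
}


def contrast_weights(factor):
    """+0.5 for deviant cells and -0.5 for standard cells of `factor`."""
    return {name: (0.5 if cell[factor] else -0.5)
            for name, cell in CONDITIONS.items()}


def apply_contrast(weights, betas):
    """Weighted sum of per-condition beta values."""
    return sum(weights[name] * betas[name] for name in weights)


if __name__ == "__main__":
    local_w = contrast_weights("local")    # xxxxY minus xxxxx
    global_w = contrast_weights("global")  # rare minus frequent
    # The two contrasts are orthogonal: their dot product is zero.
    print(sum(local_w[n] * global_w[n] for n in CONDITIONS))
    # Made-up beta values, purely to show the arithmetic.
    betas = {"xxxxx_frequent": 0.0, "xxxxY_rare": 3.0,
             "xxxxY_frequent": 1.0, "xxxxx_rare": 2.0}
    print(apply_contrast(local_w, betas), apply_contrast(global_w, betas))
```

Because the factors are orthogonal, a region can show a global effect without a local one, which is what allows the two levels to be dissociated.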
Plots.
We generated plots by extracting the β-weight of SPM regressions of individual participants' data with the hemodynamic functions of the appropriate stimulus categories, and then plotting the mean and SE of these β-weights. These values estimate, in percentages of the whole-brain fMRI signal, the size of the fMRI activation relative to the implicit rest baseline that separates trials.
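A minimal sketch of the summary statistic behind each plotted bar, assuming SE is the sample standard deviation divided by the square root of the number of observations (the text does not state the exact formula). The function name and input values are hypothetical.

```python
import math
import statistics


def mean_and_se(beta_weights):
    """Mean and standard error of a list of beta-weights (percent signal)."""
    n = len(beta_weights)
    if n < 2:
        raise ValueError("need at least two values to estimate the SE")
    return statistics.mean(beta_weights), statistics.stdev(beta_weights) / math.sqrt(n)


if __name__ == "__main__":
    # Made-up values, only to show the call.
    print(mean_and_se([1.0, 2.0, 3.0, 4.0]))
```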
Event-related functional correlation: psychophysiological interaction (PPI).
To examine the effect of auditory violation on the functional correlation between A1 and the remaining brain areas, PPI analyses (Friston et al., 1997) were conducted using SPM5 across all sessions from all monkeys. The residual of the above first-level model was extracted in A1. This residual, together with its point-by-point multiplication with the prior regressors for habituation, global standards, and global deviants, was then entered as additional regressors in a novel first-level model. Finally, their β-weights were submitted to the same second-level analysis as above, with the same contrasts, allowing us to extract which areas increased their functional correlation to A1 during, for example, global deviants relative to global standards. The statistical threshold was set at p < 0.05 (FDR-corrected).
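The construction of the additional PPI regressors can be sketched as follows: the A1 residual is kept as a regressor in its own right and is also multiplied, time point by time point, with each condition regressor. This is a simplified illustration with hypothetical names and made-up numbers; it is not the SPM5 implementation and omits any preprocessing SPM applies.

```python
def ppi_regressors(seed_residual, condition_regressors):
    """Return the seed regressor plus one interaction term per condition.

    seed_residual: list of floats, one per volume (residual signal in A1).
    condition_regressors: dict mapping condition name -> list of floats.
    """
    n = len(seed_residual)
    out = {"seed": list(seed_residual)}
    for name, reg in condition_regressors.items():
        if len(reg) != n:
            raise ValueError(f"regressor {name!r} has {len(reg)} points, expected {n}")
        # Point-by-point product: seed activity weighted by the condition.
        out[f"ppi_{name}"] = [s * r for s, r in zip(seed_residual, reg)]
    return out


if __name__ == "__main__":
    seed = [0.5, -1.0, 2.0, 0.0]
    conditions = {
        "global_standard": [1.0, 1.0, 0.0, 0.0],
        "global_deviant":  [0.0, 0.0, 1.0, 1.0],
    }
    for name, values in ppi_regressors(seed, conditions).items():
        print(name, values)
```

A β-weight on an interaction term then indexes how strongly a voxel's coupling with A1 changes during that condition.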
Results
Results are expressed at the group level. Overall, findings were consistent across monkeys with minor differences.
fMRI activations for auditory stimuli
We first examined the group activations relative to rest. Pooling over all stimuli (Fig. 2a), we observed bilateral fMRI activations within the auditory cortex, including core (A1, R), belt, and parabelt regions (left: t = 10.85, pFDR = 0, puncorrected = 4.4 × 10⁻¹⁶; right: t = 11.49, pFDR = 0, puncorrected = 4.4 × 10⁻¹⁶) and inferior colliculus (left: t = 3.30, pFDR = 0.019; right: t = 3.33, pFDR = 0.018), corresponding to the auditory pathway. During the test period, global standard sequences, which were frequent and predictable, caused a detectable activation relative to rest only in A1. By contrast, the rare global deviants bilaterally activated not only auditory cortex (core, belt, and parabelt regions; left: t = 9.15, pFDR = 0, puncorrected = 4.4 × 10⁻¹⁶; right: t = 9.35, pFDR = 0, puncorrected = 4.4 × 10⁻¹⁶), but also anterior cingulate (t = 3.97, pFDR = 0.003), striatum (right, t = 4.04, pFDR = 0.002), globus pallidus (left: t = 3.70, pFDR = 0.006; right: t = 3.08, pFDR = 0.029), thalamus (t = 3.79, pFDR = 0.005), prefrontal area 8A (left: t = 4.20, pFDR = 0.001; right: t = 5.90, pFDR = 1.06 × 10⁻⁶), premotor area 6V (left: t = 6.24, pFDR = 2 × 10⁻⁷; right: t = 4.61, pFDR = 2.8 × 10⁻⁴), left parietal cortex (ventral intraparietal area [VIP]) (t = 5.82, pFDR = 1.5 × 10⁻⁶), left temporoparietal area (TPt) (t = 3.61, pFDR = 0.008), right hippocampus (t = 4.06, pFDR = 0.002), and cerebellar dentate nuclei (left: t = 4.32, pFDR = 0.001; right: t = 4.24, pFDR = 0.001) (Fig. 2b).
Figure 2. fMRI activations for auditory stimuli. SPM maps for all sounds (a) and rare sounds (b). y, level of coronal section (Paxinos atlas). Group analysis: p < 0.05 (FDR-corrected).
Local and global novelties
We evaluated the significance and hierarchical organization of these activations by testing separately for local and global effects. The local effect is the contrast between local deviants and local standards (xxxxY vs xxxxx sequences). In agreement with prior electrophysiological studies (Javitt et al., 1994), it activated the auditory cortex bilaterally (left: A1, rostral core region and caudomedial belt region of the auditory cortex, t = 5.15, pFDR = 0.005; right: A1 and caudomedial belt region of the auditory cortex, t = 3.5, pFDR = 0.045; rostral core region, t = 3.59, puncorrected = 1.95 × 10⁻⁴). We also found local responses in the anterior cingulate gyrus (area 25) (t = 3.42, pFDR = 0.048, group level; although at the single-subject level, with a threshold of puncorrected < 0.001, only one monkey activated area 25), medial geniculate nucleus (left: t = 5.5, pFDR = 0.002; right: t = 3.95, pFDR = 0.025), the striatum (left dorsal putamen: t = 4.63, pFDR = 0.010; right caudate: t = 4.43, pFDR = 0.012; left caudate: t = 3.72, pFDR = 0.033), the dorsal thalamus (t = 4.09, pFDR = 0.020), medial superior temporal area (right: t = 4.57, pFDR = 0.01; left: t = 4.73, pFDR = 0.01), and area V4 (t = 4.27, pFDR = 0.015) (Fig. 3).
Figure 3. Local novelty activates the auditory pathway and basal ganglia in the macaque brain. a, Activation maps for local deviants minus local standards (xxxxY sequences − xxxxx sequences). b, fMRI signal change in areas responsive to local novelty (blue cross on SPM maps). Plots show signal change for habituation (hab), frequent (freq), and rare stimuli. y, level of coronal section (Paxinos atlas). Group analysis: p < 0.05 (FDR-corrected).
The global effect, contrasting rare trials minus frequent trials, activated a distributed cortical network, including the anterior cingulate cortex (area 24) (t = 3.71, pFDR = 0.038, group level; although at the single-subject level, with a threshold of puncorrected < 0.001, only one monkey activated area 24), the bilateral auditory cortex (left: area A1, t = 4.14, pFDR = 0.017; right: A1 and caudomedial belt region of the auditory cortex, t = 4.51, pFDR = 0.008), striatum (right, t = 3.91, pFDR = 0.025), prefrontal areas 8A (left: t = 5.31, pFDR = 0.003; right: t = 4.79, pFDR = 0.006), premotor areas 6V (left: t = 5.31, pFDR = 0.003; right: t = 4.38, pFDR = 0.011), left parietal cortex (VIP) (t = 4.97, pFDR = 0.005), and temporoparietal area TPt (t = 5.09, pFDR = 0.004) (Fig. 4a,b). These areas showed a global response, even on xxxxY blocks, where the rare deviants were monotonous xxxxx sequences.
Figure 4. Global novelty activates a frontoparietal network in the macaque brain. a, Activation maps for rare minus frequent sequences. b, fMRI signal change in areas responsive to global novelty (blue cross on SPM maps). Plots show signal change for habituation (hab), frequent (freq), and rare stimuli. y, level of coronal section (Paxinos atlas). At the threshold of p < 0.001 uncorrected, only one monkey activated ACC. c, Task-evoked connectivity during global novelty, using a seed in the right A1 and looking for psychophysiological interaction. ACC, Anterior cingulate cortex; TPt, temporoparietal area; IPS, intraparietal sulcus; STS, superior temporal sulcus. Group analysis: p < 0.05 (FDR-corrected).
Error propagation across the monkey brain
The predictive coding hypothesis predicts that error signals evoked by unpredicted tones in auditory cortex should propagate differently to other regions depending on the hierarchical level that they violate (Friston, 2005). For local deviants, this propagation should be limited to auditory cortex, whereas for global deviants it should expand to higher areas. We tested this hypothesis using an event-related functional correlation analysis (PPI), which examined how the functional correlation of any brain area to A1 was modulated by local and, separately, by global effects. For local deviants (compared with local standards), no increase in functional correlation was observed anywhere. For global deviants (compared with global standards), a strong increase in functional correlation was found between A1 and the posterior cingulate cortex/precuneus (t = 4.82, pFDR = 0.002), the right striatum (t = 4.07, pFDR = 0.009), prefrontal area 8A (left: t = 4.09, pFDR = 0.009; right: t = 3.62, pFDR = 0.020), premotor areas 6V (left: t = 4.34, pFDR = 0.006; right: t = 3.70, pFDR = 0.017) and 6M (left: t = 3.74, pFDR = 0.016; right: t = 3.22, pFDR = 0.036), primary motor cortex (F1(4)) (left: t = 4.51, pFDR = 0.004; right: t = 3.12, pFDR = 0.041), right agranular insula (t = 3.32, pFDR = 0.031), intraparietal sulcus (left: t = 6.00, pFDR = 1.46 × 10⁻⁴; right: t = 4.63, pFDR = 0.004), medial parietal cortex (t = 4.82, pFDR = 0.002), the dorsal bank of superior temporal sulcus (t = 4.04, pFDR = 0.009), and visual areas V4/TEO (left: t = 4.59, pFDR = 0.004; right: t = 4.83, pFDR = 0.002) (Fig. 4c).
Discussion
Our results indicate that novelty detection, a fundamental mechanism by which the brain adjusts its internal models, is organized hierarchically in the monkey brain.
First, at the local level, after the repetition of four identical tones, oddball tones deviating in pitch activated the auditory cortex, thalamus, and striatum. These regions may systematically encode stimulus frequencies and/or transition probabilities (Meyer and Olson, 2011; Wacongne et al., 2012). The identification of a local mismatch response in these sensory regions is consistent with previous human imaging studies (Opitz et al., 1999; Sabri et al., 2004; Schönwiesner et al., 2007; Bekinschtein et al., 2009) and nonhuman primate electrophysiology (Javitt et al., 1994; Gil-da-Costa et al., 2013), although our study is the first to explore mismatch responses in monkeys at the whole-brain level. There is debate as to whether the MMN can be explained exclusively by a passive process of SSA (May and Tiitinen, 2010) or denotes an active predictive-coding mechanism (Czigler et al., 2007; Näätänen et al., 2007; Wacongne et al., 2012). Electrophysiological recordings in A1 revealed a decrease in neural responses to repeated sounds and a recovery for rare deviants (Taaseh et al., 2011), but this phenomenon seems to arise primarily from SSA as it occurs identically in both predictable and unpredictable contexts (Fishman and Steinschneider, 2012). SSA also occurs in subcortical auditory pathways, such as medial geniculate body (Anderson et al., 2009; Yu et al., 2009; Antunes et al., 2010) and inferior colliculus (Pérez-González et al., 2005; Malmierca et al., 2009). However, the MMN does not entirely reduce to SSA (Farley et al., 2010). Using fMRI alone, we cannot tell whether the responses to local novelty reflect SSA, predictive coding, or both. However, our past research with the same paradigm showed that novelty responses remain present when the final sound is omitted, which can only be explained by predictive coding (Wacongne et al., 2011). 
An electrophysiology study in awake macaques proposed that SSA dominates in A1, whereas deviance detection would arise outside of A1 (Fishman and Steinschneider, 2012). In our study, whereas frequent sequences elicited a significant activation only in A1, local deviants activated a network beyond A1 (Fig. 3), probably reflecting prediction errors. Future monkey studies should combine electrophysiology with an omission paradigm, to disentangle which of our local effect fMRI responses reflect bottom-up SSA, a capacity for deviance detection, or a top-down influence from novelty signals computed elsewhere in the cortex.
By contrast with this first-order novelty effect, second-order violations in the overall sequence pattern caused activations well beyond the classical auditory pathway and recruited a “global workspace” network comprising higher-order prefrontal, cingulate, and parietal regions. These results supplement prior findings of a prefrontal and striatal encoding of complex motor sequences (Fujii and Graybiel, 2003). In monkeys, prefrontal cortex is a key region for the production of temporally ordered behavioral sequences (Fujii and Graybiel, 2003; Shima et al., 2007) and for cross-modal and cross-temporal integration (Fuster et al., 2000).
The global effect observed here with fMRI shares the same functional profile as the P3b response in human ERPs. Using the same paradigm as the human study (Bekinschtein et al., 2009), we found similar fMRI networks activated by local and global deviants in both species. The human study clearly showed that the fMRI areas responsive to global violations were plausible generators of the scalp P3b component. Monkey homologs of the scalp P3 response have also been obtained from the oddball paradigm (Paller et al., 1988; Gil-da-Costa et al., 2013), although this design cannot separate P3a and P3b components. Future simultaneous EEG-fMRI recordings should clarify whether a homolog of the P3b response exists in monkeys.
Our results imply that a sophisticated and hierarchical machinery for predictive coding evolved early on in the primate lineage. They concur with Bayesian predictive-coding theory (Friston, 2005), according to which capacities for prediction and error detection are replicated at multiple levels of the cortical hierarchy. Learning of transition probabilities has been previously observed in monkey inferotemporal cortex (Meyer and Olson, 2011) and may explain the local mismatch effect (Wacongne et al., 2012), but not the detection of global deviants. The latter implies a memory of an entire sequence of stimuli and a capacity to compare two sequences across a short time gap. In particular, identifying a monotonic sequence xxxxx as a global deviant among frequent xxxxY sequences requires expecting a distinctive end sound Y and noticing its absence. It is striking that monkeys can perform this simple yet abstract operation.
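The argument that transition probabilities alone cannot account for the global effect can be made concrete with a toy comparison. In the sketch below, one learner tracks first-order tone-to-tone transitions and another tracks the frequency of whole five-tone sequences; both are trained on a block of xxxxY trials. These learners, their names, and the add-one smoothing are illustrative assumptions and are not models fitted or tested in this study.

```python
import math
from collections import Counter


def transition_surprise(training, test):
    """Surprise (-log2 p) of the last tone of `test`, given the tone before it,
    under first-order transition probabilities counted within `training`."""
    pair_counts, prev_counts = Counter(), Counter()
    for seq in training:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            prev_counts[a] += 1
    p = pair_counts[(test[-2], test[-1])] / prev_counts[test[-2]]
    return -math.log2(p)


def sequence_surprise(training, test):
    """Surprise of the whole sequence under its observed frequency,
    with add-one smoothing over the two possible patterns."""
    counts = Counter(training)
    p = (counts[test] + 1) / (len(training) + 2)
    return -math.log2(p)


if __name__ == "__main__":
    block = ["xxxxY"] * 20  # xxxxY is the global standard in this block
    for seq in ("xxxxY", "xxxxx"):
        print(seq,
              round(transition_surprise(block, seq), 3),
              round(sequence_surprise(block, seq), 3))
```

In this toy setting the transition learner rates the final x of xxxxx as less surprising than the final Y of xxxxY, because x follows x on three of every four transitions, whereas the sequence learner rates xxxxx as the more surprising event. Only the second ordering matches the global effect on xxxxY blocks.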
The response to global deviants systematically disappears whenever humans are made unaware of the global regularity, due to inattention, distraction, coma, or vegetative state, and the presence of a global effect in the present local-global paradigm has been proposed as a test of preserved consciousness during recovery from coma (Bekinschtein et al., 2009; Faugeras et al., 2012). This reasoning tentatively suggests that monkeys may be aware of the global regularity and that the activation of their prefrontal cortex and interconnected areas forming a “global workspace” network indicates a conscious recognition of the rare global deviants. Although clearly speculative, the idea that a basic level of sensory awareness exists in monkeys fits with several prior studies of the behavioral and neural correlates of consciousness in monkeys, which indicate that they too experience blindsight (Cowey and Stoerig, 1995), masking (Macknik and Haglund, 1999; Lamme et al., 2002), and binocular rivalry (Leopold and Logothetis, 1999), with properties similar to humans and that they also possess a meta-cognitive self-knowledge of their confidence and errors (Terrace and Son, 2009).
Our findings may also contribute to the ongoing debate about the putative evolutionary precursors of language (Petkov and Jarvis, 2012). Strikingly, monkeys listening to second-order sequence violations robustly activated their inferior frontal area 8A bilaterally, as well as the left temporoparietal area TPt, a putative macaque homolog of the human temporoparietal language area (Spocter et al., 2010). Variants of the present method, using sequences of variable internal complexity (including, e.g., the abstract template “aaaab,” where a and b could be any two distinct sounds), could shed light on whether a form of proto-syntax is present in nonhuman primates (Fitch and Friederici, 2012).
Notes
Supplemental material for this article is available at https://www.dropbox.com/sh/rsvm46wenph073u/k42eKWjZxt. Supplemental methods, results, discussion. Individual data. This material has not been peer reviewed.
Footnotes
This work was supported by the European Research Council (ERC NeuroConsc) to S.D., Inserm Avenir program to B.J., Collège de France, CEA Neurospin, and Bettencourt-Schueller Foundation. We thank Olivier Joly for contributing to data collection and analysis; David Janssen for code writing; Naoki Tani and Martin Bataille for help with behavioral experiments; Alexis Amadon, Hauke Kolster, Laurent Laribière, and the NeuroSpin MRI and informatics teams for help with imaging tools; Christophe Joubert, Elodie Bouchoux, and Jean-Marie Helies for animal facilities; Michel Bottlaender, Jean-Robert Deverre, and Denis Le Bihan for support; and Philippe Robert from Guerbet Research (Aulnay-Sous-Bois, France) for providing the iron oxide contrast agent.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Béchir Jarraya, Neurospin Imaging Center, Inserm Avenir, CEA Saclay, Bat 145, 91191 Gif-sur-Yvette, France. bechir.jarraya{at}cea.fr