Abstract
Recent theories of autism propose that a core deficit in autism would be a less context-sensitive weighting of prediction errors. There is also first support for this hypothesis on an early sensory level. However, an open question is whether this decreased context sensitivity is caused by faster updating of one's model of the world (i.e., higher weighting of new information), proposed by predictive coding theories, or slower model updating. Here, we differentiated between these two hypotheses by investigating how first impressions shape the mismatch negativity (MMN), reflecting early sensory prediction error processing. An autism and matched control group of human adults (both n = 27, 8 female) were compared on the multi-timescale MMN paradigm, in which tones were presented that were either standard (frequently occurring) or deviant (rare), and these roles reversed every block. A well-replicated observation is that the initial model (i.e., the standard and deviant sound in the first block) influences MMN amplitudes in later blocks. If autism is characterized by faster model updating, and thus a smaller primacy bias, we hypothesized (and demonstrate using a simple reinforcement learning model) that their MMN amplitudes should be less influenced by the initial context. In line with this hypothesis, we found that MMN responses in the autism group did not differ between the initial deviant and initial standard sounds as they did in the control group. These findings are consistent with the idea that autism is characterized by faster model updating during early sensory processing, as proposed by predictive coding accounts of autism.
SIGNIFICANCE STATEMENT Recent theories propose that a core deficit in autism is faster updating of one's models of the world based on new sensory information. Here, we tested this hypothesis by investigating how first impressions shape brain responses during early sensory processing, and hypothesized that individuals with autism would be less influenced by these first impressions. In line with earlier studies, our results show that early sensory processing was influenced by first impressions in a control group. However, this was not the case in an autism group. This suggests that individuals with autism are faster to abandon their initial model, and is consistent with the proposal that they are faster to update their models of the world.
Introduction
Autism spectrum disorder (henceforth “autism”) is characterized by deficits in social interaction and communication, and nonsocial symptoms like restricted, repetitive patterns of behavior (American Psychiatric Association, 2013). Ever since autism was first described, researchers have tried to identify one core deficit that can account for this heterogeneous set of characteristics. Recent theories, grounded in the predictive coding framework (Lawson et al., 2014; Van de Cruys et al., 2014; for a review see Palmer et al., 2017), proposed that a core deficit in autism might be a problem in the flexible weighting of prediction errors, which could explain a broad range of autism characteristics (Van de Cruys et al., 2014).
According to the predictive coding framework, the brain constantly makes predictions or “models” about the world and processes all incoming sensory information compared with those predictions (Rao and Ballard, 1999; Friston, 2010). The difference between incoming information and predictions is encoded as a surprise signal called a prediction error. The brain then uses this signal to adapt its model of the world. However, the brain needs to distinguish between important prediction errors that should be used to update its model, and less important prediction errors that can be ignored. It is this ability to flexibly adjust the weight or importance a person gives to their prediction errors that is proposed to be impaired in autism (Pellicano and Burr, 2012; van Boxtel and Lu, 2013; Lawson et al., 2014; Van de Cruys et al., 2014; Palmer et al., 2017).
Specifically, predictive coding theories propose that people with autism are generally faster in updating their model of the world based on incoming information (Lawson et al., 2014; Van de Cruys et al., 2014, 2017). However, an inflexible weighting of prediction errors could also be explained by people with autism being slower in updating their models of the world. This would mean that older information that is likely irrelevant in the current context is still taken into account. While this is contrary to predictive coding accounts of autism, I. Lieder et al. (2019) showed that participants with autism weighted recent stimuli less heavily in a perceptual decision-making task. Here, we investigate which of these two explanations can account for our previous finding of decreased context-specific prediction errors (Goris et al., 2018), by investigating how first impressions influence sensory processing in autism.
For this purpose, we rely on the multi-timescale paradigm by Todd et al. (2011, 2013). In this paradigm, participants are presented with two tones that are either standard or deviant, but these probabilities reverse in alternating blocks of sound that make up the sequences. When exposed to these sequences, sound-evoked brain potentials can be used to assess how the brain rapidly and automatically forms a model of the regularity and produces a prediction error signal, called mismatch negativity (MMN) when a deviant tone occurs (Wacongne et al., 2011; F. Lieder et al., 2013). By manipulating the length of the blocks, sequences containing slower changing and faster changing alternations can be used to assess how MMN is weighted by different periods of stability. The multi-timescale paradigm revealed a primacy bias such that the initial model (i.e., the respective tones that were the standard and the deviant, in the first block) influences perception in later blocks (Todd et al., 2011, 2013). Specifically, later blocks with reversed contingencies typically show smaller MMN amplitudes (as opposed to those with similar contingencies), especially if they are longer and provide more time to (re)learn one's model (Todd et al., 2011). Therefore, using this paradigm, we can investigate the speed with which people update their models. If individuals with autism are faster in updating models of sensory information, as suggested by predictive coding theories, we should see a smaller primacy bias (i.e., a reduced influence of initial learning leading to equivalent modulation of MMN amplitudes by the initial model), as we expect them to be less influenced by their older model of the world.
Materials and Methods
Participants
In total, 30 adults with autism and 30 typically developed (TD) adults participated in the study. Three participants in each group were excluded because of excessive α waves in the EEG signal (see below). The final sample size thus consisted of 27 participants with autism (8 female, 5 left-handed, 2 ambidextrous) and 27 TD participants (8 female, 6 left-handed). All participants had a full-scale IQ >80, and reported to be free of hearing problems. Participants in the TD group were screened to have no reported history of neurologic or psychiatric disorders, and have scores below the cutoff on both the Autism Spectrum Quotient (<32, AQ) (Baron-Cohen et al., 2001) and the Social Responsiveness Scale–Adult version (T-score <61, SRS-A) (Constantino, 2002). These are the cutoffs as described in the original questionnaires, meant to screen for autistic traits in an adult population. Total score for the AQ in the TD group was mean = 13.03 (SD = 5.50) and in the autism group mean = 35.72 (SD = 7.88). Total T-score for the SRS-A in the TD group was mean = 46.72 (SD = 5.37) and in the autism group mean = 77.05 (SD = 11.81).
Age ranged from 23 to 50 years (mean = 35.63, SD = 7.54) and did not differ significantly between the two groups (t(51.41) = 0.83, p = 0.41). All participants gave written informed consent before participation and were financially compensated. The study was approved by the local Ghent University ethics committee.
All adults with autism had received a clinical diagnosis of autism spectrum disorder (n = 20), Asperger's syndrome (n = 6), or autistic disorder (n = 1), before the experiment, by an independent clinician or multidisciplinary team. The diagnoses were verified with the Autism Diagnostic Observation Schedule 2 (ADOS-2) (Lord et al., 2000) Module 4 by a trained researcher using the revised scoring algorithm (Hus and Lord, 2014). In line with earlier autism studies (Magnée et al., 2008; Deschrijver et al., 2016, 2017; Goris et al., 2018), participants with autism were screened to have an ADOS-2 total score of 1 point below the cutoff or higher. Importantly, the results did not change in a statistically significant way when we used the ADOS-2 total score cutoff as the exclusion criterion, which resulted in the exclusion of only one participant. Mean total ADOS score was 13.15 (SD = 3.57).
Intelligence was assessed by using the Kaufman 2 short form Wechsler Adult Intelligence Scale–third edition (WAIS-III), as recommended by Minshew et al. (2005). For 12 participants with autism, we used a WAIS-IV or Kaufman 2 short form WAIS-III that was available and completed within the past 5 years. There was no significant difference in IQ between the two groups (t(50.66) = 1.63, p = 0.11; TD group: mean = 118.00, SD = 12.26; autism group: mean = 112.07, SD = 14.44).
Task and stimuli
The task was an adapted version of the multi-timescale MMN paradigm by Todd et al. (2013), which is an auditory oddball paradigm where the regularities defining standard and oddball sounds change over time, in short versus long time windows (Fig. 1).
In this paradigm, participants were presented via headphones with 1000 Hz pure tones, with durations of 60 and 30 ms at a stimulus-onset asynchrony of 300 ms. In each block, one of these tones was presented as the standard tone (probability of 0.875) and the other as the deviant tone (probability of 0.125). These roles alternated after every 480 tone presentations (2.4 min) in the slow change blocks, and after every 160 presentations (0.8 min) in the fast change blocks. A sequence consisted of four slow change blocks followed by 12 fast change blocks. Participants were presented with two sequences. In the second sequence, the roles in the initial block were opposite to those in the initial block of the first sequence. For example, if the 60 ms tone was the deviant tone in the first block of the first sequence, the 30 ms tone would take this role in the first block of the second sequence. Roles in the initial block of the first sequence were also counterbalanced across participants. Prior observations show that the effects are order-driven and not tone-driven (i.e., can be observed equally if the first deviant is a 30 ms tone or a 60 ms tone, Todd et al., 2013, 2020). There was a 1.5 min break between the slow and fast blocks, and a 5 min break between the two sequences, resulting in a total duration of ∼47 min. All stimuli were presented binaurally with an intensity of 70 dB through EEG-compatible insert earphones (ER-3C, MedCaT) using PsychoPy2 1.85.2. Participants were instructed to watch a silent, subtitled nature documentary and to ignore the sounds.
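The block structure and timing described above can be sketched in a few lines of Python. This is an illustrative sketch, not the PsychoPy script used in the study. The function names are hypothetical, the within-block order is a plain shuffle (any constraints on deviant spacing are not stated here), and the sketch assumes that the role alternation simply continues across the break between the slow and fast blocks.

```python
import random

SOA_S = 0.300                       # stimulus-onset asynchrony in seconds
P_DEVIANT = 0.125                   # deviant probability within a block
SLOW_BLOCK, FAST_BLOCK = 480, 160   # tones per block
N_SLOW, N_FAST = 4, 12              # blocks per sequence


def make_block(n_tones, standard_ms, deviant_ms, rng):
    """Return shuffled tone durations (ms) containing exactly 12.5% deviants."""
    n_dev = round(n_tones * P_DEVIANT)
    tones = [deviant_ms] * n_dev + [standard_ms] * (n_tones - n_dev)
    rng.shuffle(tones)  # assumption: unconstrained random order
    return tones


def make_sequence(first_deviant_ms, rng):
    """Four slow blocks, then twelve fast blocks; roles swap after every block."""
    other = {30: 60, 60: 30}
    deviant = first_deviant_ms
    blocks = []
    for n_tones in [SLOW_BLOCK] * N_SLOW + [FAST_BLOCK] * N_FAST:
        blocks.append(make_block(n_tones, other[deviant], deviant, rng))
        deviant = other[deviant]  # assumption: alternation continues across the break
    return blocks


rng = random.Random(1)
sequence_1 = make_sequence(first_deviant_ms=60, rng=rng)
sequence_2 = make_sequence(first_deviant_ms=30, rng=rng)  # opposite initial roles

# Duration check: tones per sequence times the SOA, plus the stated breaks.
tones_per_sequence = sum(len(block) for block in sequence_1)
sequence_min = tones_per_sequence * SOA_S / 60
total_min = 2 * sequence_min + 2 * 1.5 + 5
```

Under these assumptions each sequence lasts 19.2 min of tone presentation, and the two sequences plus breaks add up to roughly the stated total of ∼47 min.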
Procedure
For participants with autism, the ADOS-2 and short form WAIS-III were completed during a first test session on a separate day. Four TD participants also completed the short form WAIS-III in a first session. The other TD participants were assessed with the short form WAIS-III at the end of the EEG session.
During the EEG session, participants first completed the multi-timescale MMN paradigm. After this, they took part in two more EEG experiments with an overall duration of ∼30 min, of which the results will be presented elsewhere. Next, they filled in three questionnaires measuring autistic traits: the AQ (Baron-Cohen et al., 2001; Dutch version: Hoekstra et al., 2008), the SRS-A (Constantino, 2002; Dutch version: Noens et al., 2012), and the Adolescent/Adult Sensory Profile (Brown and Dunn, 2002; Dutch version: Rietman, 2007).
EEG recordings and analyses
EEG activity was recorded at a sample rate of 1024 Hz using an ActiveTwo EEG amp (BioSemi) from 64 active Ag/AgCl scalp electrodes placed according to the international 10–20 system. Additional electrodes were applied at the mastoids, near the canthi and above and below the left eye. The data were referenced online to the common mode sense electrode. Data were recorded in an electrically shielded chamber. Electrode offsets were kept between −30 and 30 µV at all electrodes.
EEG data analysis was performed using BrainVision Analyzer 2.1 and R 3.3.1 in RStudio 1.0.143, using the erpR package for plotting (Arcara and Petrova, 2014). Data were first rereferenced to the average of all scalp electrodes, downsampled to 500 Hz, and filtered with a notch filter at 50 Hz, a high-pass filter at 0.5 Hz (12 dB/oct) (see also, Bekinschtein et al., 2009), and a low-pass filter at 30 Hz (12 dB/oct). Trials were then epoched from 50 ms before tone onset to 300 ms after. Next, ocular artifacts were corrected with the Gratton and Coles method as implemented in BVA. Epochs exceeding an amplitude of ±75 µV at the scalp or mastoid electrodes were rejected. When >10% of epochs were rejected because of amplitudes exceeding ±75 µV at a specific electrode, this electrode was interpolated. A baseline correction was applied from −50 to 0 ms relative to tone onset (Todd et al., 2011, 2013, 2020). Then, the data were rereferenced to the average of the mastoids (as recommended by Kujala et al., 2007b). For two participants, we used only the left or right mastoid, because of excessive noise in the other mastoid. Next, 8 event-related potential (ERP) waveforms for deviant tones and 8 ERP waveforms for standard tones were created (i.e., for the 30 and 60 ms tones in both slow and fast blocks for each of the sequences). The first five trials of each block and the first standard after each deviant were excluded from averages. Finally, the ERPs were filtered with a 20 Hz low-pass filter (12 dB/oct), as recommended for MMN measurement (Kujala et al., 2007b). As mentioned earlier, we excluded three participants in each group because of excessive α waves based on visual inspection of the individual average waveforms. For one participant, we interpolated an electrode in slow blocks of the first sequence only, based on visual inspection. 
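The trial-selection rule for averaging (drop the first five trials of each block and the first standard after each deviant) can be made concrete with a short sketch. The analysis itself was run in BrainVision Analyzer; the function below is a hypothetical plain-Python illustration of the rule, with argument names chosen for this example.

```python
def averaging_mask(is_deviant, block_starts, n_skip=5):
    """Return a list of booleans marking trials kept for ERP averaging.

    is_deviant   : list of booleans, True where the trial was a deviant
    block_starts : indices of the first trial of each block
    n_skip       : number of initial trials dropped per block
    """
    n = len(is_deviant)
    keep = [True] * n
    # Drop the first n_skip trials of every block.
    for start in block_starts:
        for i in range(start, min(start + n_skip, n)):
            keep[i] = False
    # Drop the first standard that follows a deviant.
    for i in range(1, n):
        if is_deviant[i - 1] and not is_deviant[i]:
            keep[i] = False
    return keep


# Toy block of 12 trials: deviants at positions 6, 9 and 10.
toy = [False] * 6 + [True, False, False, True, True, False]
mask = averaging_mask(toy, block_starts=[0])
```

In the toy block, trials 0 to 4 are dropped as block-initial trials, and trials 7 and 11 are dropped as the first standards after a deviant.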
There was no significant difference in the final number of trials between the TD and autism groups (t(47.30) = −0.60, p = 0.55; for number of trials per group in each condition, see Extended Data Fig. 2-1).
Difference waveforms (MMNs) for each condition were created by subtracting the waveform for the tone as a standard from the tone as a deviant. For example, the MMN for the 60 ms tone in the slow blocks was created by subtracting the waveform for the 60 ms tone when it was a standard in the slow blocks from the waveform for the 60 ms tone when it was a deviant in the slow blocks (Jacobsen and Schröger, 2003). For statistical analyses, MMN amplitudes were then defined as the mean amplitude (±20 ms) surrounding the peak negativity between 100 and 220 ms after stimulus onset in the difference waveforms, for each participant, electrode, and condition separately (similar to Goris et al., 2018). Thus, MMN amplitude was measured in a 40 ms time window around the individual peaks. Based on earlier studies and the topography across groups, we included data of three frontocentral electrodes (F3, Fz, and F4) (Kujala et al., 2007b; Ruiz-Martínez et al., 2020). A repeated-measures multivariate ANOVA (MANOVA; to deal with a possible sphericity violation) was conducted on MMN amplitudes, including group (2 levels: autism or TD) as a between-subject factor and initial role (2 levels: initial deviant and initial standard), speed (2 levels: slow and fast), sequence (2 levels: first and second), and electrode (3 levels: F3, Fz, and F4) as within-subject factors. Significant interaction effects were then followed up with post hoc MANOVAs. Bayes factors for the main effects of interest were computed using Inclusion Bayes factors across matched models as implemented in JASP version 0.10 (JASP Team, 2021), using default settings.
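The amplitude measure described above (difference waveform, peak negativity between 100 and 220 ms, mean over ±20 ms around that peak) can be illustrated with a minimal NumPy sketch. The function name, argument names, and the synthetic waveform are assumptions made for this example; the reported values were obtained with BrainVision Analyzer and R.

```python
import numpy as np


def mmn_amplitude(deviant_erp, standard_erp, times_ms,
                  search=(100, 220), half_width=20):
    """Mean amplitude in a window of +/- half_width ms around the most negative
    point of the deviant-minus-standard waveform within the search interval."""
    diff = np.asarray(deviant_erp, dtype=float) - np.asarray(standard_erp, dtype=float)
    times = np.asarray(times_ms, dtype=float)
    # Indices of samples inside the peak search interval.
    idx = np.flatnonzero((times >= search[0]) & (times <= search[1]))
    peak_time = times[idx[np.argmin(diff[idx])]]
    # Average the difference wave over the 40 ms window centred on the peak.
    window = (times >= peak_time - half_width) & (times <= peak_time + half_width)
    return diff[window].mean(), peak_time


# Synthetic example at 500 Hz (2 ms steps), epoch from -50 to 300 ms.
times = np.arange(-50, 300, 2)
standard = np.zeros(times.size)
deviant = -3.0 * np.exp(-((times - 150) / 30.0) ** 2)  # negativity peaking at 150 ms
amplitude, peak_time = mmn_amplitude(deviant, standard, times)
```

Because the window average includes samples on both flanks of the peak, the resulting amplitude is always somewhat smaller in magnitude than the peak value itself.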
For the block analyses reported below, MMN waveforms were created by grouping two blocks together; for example, the MMN for the 60 ms tone in the first two blocks of the slow blocks was created by subtracting the waveform for this tone when it was a standard (e.g., Block 1) from the waveform for when it was a deviant (e.g., Block 2).
Model simulations
To further formalize our predictions, and demonstrate how a change in primacy bias can predict our larger pattern of results, we also decided to simulate these using a simple reinforcement learning model (see also Goris et al., 2021). This model was presented with the same series of “sounds” as the participants in the first sequence. Specifically, the model consisted of two pairs of weights that, on each trial, predicted one of the two sounds. One pair of weights quantified the evidence for Sound A (wrA and wdA) and the other for Sound B (wrB and wdB). The pairs of weights each consisted of one weight that was associated with a regular, fixed learning rate (wr), and one that was associated with an exponentially decaying learning rate (wd), to simulate the primacy bias (see below). The weights were initialized to zero. On each trial, the model decided which sound to predict based on the softmax decision rule, which chooses Option A with the following probability:
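The displayed equation is not reproduced in this text. A reconstruction consistent with the surrounding description (the two weights of each sound combined through the primacy bias parameter p, divided by the temperature t, and exponentiated) would be the following. The exact way in which p enters the weighted sum is an assumption.

```latex
P(A) = \frac{\exp\!\left(\dfrac{p\,w_{rA} + (1-p)\,w_{dA}}{t}\right)}
            {\exp\!\left(\dfrac{p\,w_{rA} + (1-p)\,w_{dA}}{t}\right)
           + \exp\!\left(\dfrac{p\,w_{rB} + (1-p)\,w_{dB}}{t}\right)}
```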
In other words, the weights leading to Sound A were divided by the weights leading to Sound A and Sound B. The probability for Sound B was 1 – P(A). Importantly, the degree to which the model was affected by the regular learning rate versus decaying learning rate was determined by the primacy bias parameter p, which can vary between 0 and 1. If p is 1, the model only learns with the regular learning rate, whereas a p of 0 would lead to learning only with a decaying learning rate. One can think of the parameter p as the proportion of neurons in one's neuronal population that is characterized by the one versus the other type of learning rate. This parameter was set to 0.75 for the TD group. As is standard in reinforcement learning models, for each sound, the summed weights were also divided by a temperature parameter t, the result of which was exponentiated. This way, the model most often predicts the most activated sound (exploitation), but occasionally predicts the least activated sound (exploration), with higher temperatures leading to more exploring. The temperature was set to 0.5, but changes to this parameter produced highly similar results. After each trial, the two weights leading to the sound the model predicted were updated according to the delta learning rule as follows:
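The update rule itself is not reproduced in this text. The standard delta rule, which matches the prediction error term R – wn and the learning rate a used in the description, reads as follows (a reconstruction, with n indexing trials and w standing for either weight of the predicted sound):

```latex
w_{n+1} = w_n + a\,(R - w_n)
```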
When the sound was correctly predicted, reward R was set to 1, and when incorrect it was set to 0. The learning rate a determines the impact of the prediction error R – wn on the weights of a next trial. When the learning rate is high, the model updates its predictions fast in the face of new incoming information, whereas when the learning rate is low, it updates its predictions more slowly. In the TD model, the learning rate of the regular weights was fixed and set to 0.02 in the slow blocks and 0.04 in the fast blocks, in line with previous studies consistently showing an increase in learning rate in more fast-changing environments (e.g., Behrens et al., 2007; Silvetti et al., 2011; Goris et al., 2021). Importantly, to simulate the primacy bias, the learning rate of the decay weights was high at first (i.e., 0.3), but decayed with every trial by multiplying the learning rate of the previous trial by the factor 0.98 (see Fig. 6A).
Crucially, to simulate the difference between the autism and TD group, we changed the primacy bias parameter p in the autism model from 0.75 to 0.95, so that the autism group's learning and predictions were less influenced by the decaying learning rate. The autism group's regular learning rate was adjusted to 0.0175 and 0.0333 in the slow and fast blocks, respectively, to ensure that the average learning rate over all trials (weighted by p) was the same for the TD and autism groups. This way, we ensured that both groups did not differ in learning rate, temperature, or accuracy (TD = 66.25% and autism = 68.55%; maximum accuracy is 87.5%, in case participants would be completely aware of the tone roles and predict the standard tone in all trials), and the only difference between the autism and TD model was the reduced primacy bias in the autism group.
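For readers who want to experiment with the model, the description above can be turned into a minimal simulation sketch. This is not the authors' code, and several choices are assumptions made for illustration: p weights the regular and decaying weights inside the softmax, the decaying learning rate runs down over the whole sequence without being reset, each tone is drawn independently as a deviant with probability 0.125, and the recorded prediction error is the reward minus the p-weighted weight of the chosen sound. The function name and returned field names are hypothetical.

```python
import numpy as np


def simulate_sequence(p, lr_slow, lr_fast, temperature=0.5,
                      decay_start=0.3, decay_factor=0.98, seed=0):
    """Simulate sequence 1: 4 slow blocks of 480 tones, then 12 fast blocks of 160."""
    rng = np.random.default_rng(seed)
    block_lengths = [480] * 4 + [160] * 12
    w_reg = np.zeros(2)     # regular-rate weights for Sound A (0) and Sound B (1)
    w_dec = np.zeros(2)     # decaying-rate weights for Sound A and Sound B
    lr_dec = decay_start    # assumption: decays across the sequence, never reset
    deviant = 0             # assumption: Sound A is the first deviant
    out = {"block": [], "is_deviant": [], "correct": [], "pred_error": []}

    for block, n_tones in enumerate(block_lengths):
        lr_reg = lr_slow if n_tones == 480 else lr_fast
        # Assumption: each tone is independently a deviant with probability 0.125.
        sounds = np.where(rng.random(n_tones) < 0.125, deviant, 1 - deviant)
        for sound in sounds:
            combined = p * w_reg + (1 - p) * w_dec
            drive = combined / temperature
            probs = np.exp(drive - drive.max())   # softmax, shifted for stability
            probs /= probs.sum()
            choice = rng.choice(2, p=probs)
            reward = 1.0 if choice == sound else 0.0
            out["block"].append(block)
            out["is_deviant"].append(sound == deviant)
            out["correct"].append(reward == 1.0)
            out["pred_error"].append(reward - combined[choice])
            # Delta rule, applied only to the two weights of the predicted sound.
            w_reg[choice] += lr_reg * (reward - w_reg[choice])
            w_dec[choice] += lr_dec * (reward - w_dec[choice])
            lr_dec *= decay_factor
        deviant = 1 - deviant   # roles reverse after every block
    return {key: np.array(values) for key, values in out.items()}


# Parameter values as stated in the text for the TD and autism models.
td = simulate_sequence(p=0.75, lr_slow=0.02, lr_fast=0.04)
autism = simulate_sequence(p=0.95, lr_slow=0.0175, lr_fast=0.0333)
```

Because of the assumptions listed above and the random seed, the accuracies produced by this sketch need not reproduce the reported values of 66.25% and 68.55%.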
Results
MMN
A complete overview of all effects in the repeated-measures MANOVA can be found in Table 1. The mean MMN amplitudes for each group and each condition are presented in Figure 2. This figure shows clear group differences that seem to disappear for the second sequence, which was confirmed in the analysis. MMN waveforms for the first sequence are presented in Figure 3.
Extended Data Figure 2-1. Number of trials per group in each condition (supplementary figure).
The omnibus MANOVA showed a main effect of speed (F(1,52) = 34.27, p < 0.001, ηp2 = 0.40), indicating a larger MMN in slow versus fast changing blocks, suggesting that people, when given more time, build stronger models of what are deviant and what are standard sounds and hence show larger MMN amplitudes. Second, we found a main effect of initial role (F(1,52) = 14.56, p < 0.001, ηp2 = 0.22), indicating that participants show a larger MMN for the sound that was initially presented as the deviant sound than for the sound that was initially the standard sound when it was later presented as a deviant. This suggests that participants have a stronger model when it is in line with the first experienced, primary context model, reflected in larger MMN amplitudes. Third, there was a main effect of sequence (F(1,52) = 14.87, p < 0.001, ηp2 = 0.22), showing that MMN responses decrease over time from the first to second sequence. We also found a significant two-way interaction between group and electrode (F(2,51) = 4.24, p = 0.02, ηp2 = 0.14), suggesting a stronger effect of electrode in the TD compared with the autism group. However, this main group difference in electrode did not interact with any other effect of interest. More importantly, previous studies with this design have shown an interaction between speed and initial role (Todd et al., 2011, 2013, 2014), which often further interacts with sequence (Todd et al., 2013, 2014), as a marker of how people update their model differently depending on whether it matches the first experienced, primary model or not, an observation that is usually most pronounced in the beginning of the experiment and can fade away over time (e.g., Todd et al., 2013, 2014). Across groups, neither this two-way interaction, nor three-way interaction, reached significance (both F(1,52) < 0.35, both p > 0.55).
Instead, the MANOVA showed a significant four-way interaction between group, initial role, sequence, and speed (F(1,52) = 5.60, p = 0.02, ηp2 = 0.10), in line with the idea that people with autism differ from TD subjects in how first impressions shape subsequent sensory processing. Follow-up MANOVAs for each sequence separately indicated a significant three-way interaction between group, speed, and initial role only in the first sequence (F(1,52) = 9.22, p < 0.01, ηp2 = 0.15), but not the second sequence (F(1,52) = 0.75, p = 0.39, ηp2 = 0.01, see Fig. 2), suggesting that there were group differences in model updating, consistent with the idea of a differential primacy bias, during the first part of the experiment. As can be noted from both Figures 2 and 4, there is an outlier amplifying the effect in the TD group. However, the three-way interaction in the first sequence yields the same result without this participant in the TD group (F(1,51) = 7.92, p < 0.01, ηp2 = 0.13). The interaction between initial role and speed was not significant in the second sequence (F(1,52) = 0.59, p = 0.44, ηp2 = 0.01, see Fig. 2), indicating that both groups no longer showed any sign of a primacy bias in the second half of the experiment. The Bayes factor BF10 for the three-way interaction in the first sequence, however, was BF10 = 19.09, indicating strong evidence for the interaction. Therefore, the remainder of the analyses focused on the first sequence only.
The three-way interaction between group, initial role, and speed in the first sequence showed that people with autism differed from TD subjects in how their MMN was modulated by the initial roles as well as the speed with which the tone probabilities changed (fast vs slow). In line with previous observations from Todd et al. (2011, 2013, 2014), MMN amplitudes in the TD group showed a marginally significant initial role by speed interaction (F(1,26) = 3.35, p = 0.08, ηp2 = 0.11). In the autism group, the same interaction was significant but in the reverse direction (F(1,26) = 6.87, p = 0.01, ηp2 = 0.21).
To unpack the origins of these group differences, the slow and fast changing blocks were analyzed separately. Specifically, in the slow condition, there was a significant main effect of initial role for the TD group (F(1,26) = 8.73, p < 0.01, ηp2 = 0.25), meaning that MMN amplitudes were higher for initial deviants than for initial standards when they later became deviants, whereas the autism group showed no such difference between initial deviants and initial standards (F(1,26) = 0.22, p = 0.64, ηp2 = 0.01). These differences are clearly evident in Figure 2. This group difference in the effect of initial role in the slow condition was also supported by a group by initial role interaction in the slow condition (F(1,52) = 4.45, p = 0.04, ηp2 = 0.08). However, in the fast condition, both groups were comparable (F(1,52) = 0.00, p = 0.99, ηp2 = 0.00). This suggests that there was a primacy bias in the slow condition for the TD group, which was absent in individuals with autism. To investigate this result as a function of sequential learning, we further explored MMN amplitudes for each block in the first sequence separately.
Analyses per block in the first sequence
To enable a more serial inspection of the potential origin of group differences, separate difference waveforms were computed for each block encountered during Sequence 1. As can be seen in Figure 4, the main group differences are evident in the slow condition blocks where the effect of initial role on MMN amplitudes could be observed across all four blocks in the TD group, suggesting it was not just an effect of the first block showing a larger MMN. When the initial deviant was again presented as a deviant in the third block, the MMN amplitude increased back to its original level. This suggests that participants were more inclined to rely on their initial model, and can thus be interpreted as a bias toward what participants encountered first. Importantly, this bias was completely absent in the autism group where the MMN amplitude was reasonably equivalent throughout. This was supported with a group by initial role interaction (F(1,52) = 6.47, p = 0.01, ηp2 = 0.11), indicating an effect of initial role only in the TD group (F(1,26) = 9.42, p < 0.01, ηp2 = 0.27), and not in the autism group (F(1,26) = 0.01, p = 0.94, ηp2 = 0.00). There were no main or interaction effects of block (grouped together by two blocks, resulting in a factor with 2 levels: the first two vs the last two blocks).
As groups did not differ in the fast condition, it is possible that either both groups do show a primacy bias here or that there is an absence of this bias in both groups. Figure 4 suggests that the primacy bias disappears over time in the fast condition. Indeed, when investigating blocks separately (grouped together by two blocks, resulting in 6 levels), there was an initial role by block interaction (F(5,48) = 3.51, p < 0.01, ηp2 = 0.27), indicating a significant effect of initial role only in the first two blocks (F(1,52) = 20.27, p < 0.001, ηp2 = 0.28), but not in the other blocks. Therefore, rather than still reflecting a primacy bias, the remaining effect of initial role could be because of a heightened MMN after the short break between the slow and fast phase, as well as the unexpected fast reversal in contingencies in the second block of the fast phase. Consistent with our findings above, there were no interactions with group in the fast condition.
Correlations between MMN and autistic traits
The above group analyses showed a clear group difference in primacy bias. In search for further convergent evidence, we also investigated whether this primacy bias effect was related to autistic traits as measured with questionnaires or ADOS-2 interview scores within these groups.
The speed by initial role interaction was quantified as the difference in MMN amplitude between the initial deviant and initial standard in the first slow condition, minus this difference in the first fast condition. This way, higher, positive numbers mean a stronger effect of initial roles in the slow condition compared with the fast condition. In both the TD and autism groups, there were no significant correlations with the questionnaire scores (all |r|<0.29, p > 0.14). However, when studying the ADOS-2 scores (which were only administered in the autism group), there was a marginally significant correlation between speed by initial role interaction and ADOS-2 total score (r = –0.37, p = 0.06, see Fig. 5A), suggesting that the interaction between speed and initial role might decrease with increasing autism symptom severity. When IQ was controlled for (in a multiple linear regression), this p value changed by no more than 0.02.
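The interaction index and its correlation with symptom scores can be sketched as follows. The text does not state the sign convention, so this sketch assumes that differences are taken on MMN magnitudes (sign-flipped amplitudes), which makes a larger MMN for the initial deviant yield a positive value. The function names and the toy numbers are hypothetical and are not data from the study.

```python
import numpy as np


def interaction_index(slow_init_dev, slow_init_std, fast_init_dev, fast_init_std):
    """Effect of initial role in the slow condition minus that in the fast condition.

    Inputs are MMN amplitudes in microvolts (typically negative).
    Assumption: the effect of initial role is computed on MMN magnitudes.
    """
    slow_effect = -np.asarray(slow_init_dev, dtype=float) + np.asarray(slow_init_std, dtype=float)
    fast_effect = -np.asarray(fast_init_dev, dtype=float) + np.asarray(fast_init_std, dtype=float)
    return slow_effect - fast_effect


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    return float(np.corrcoef(x, y)[0, 1])


# Toy participant: larger MMN for the initial deviant in slow blocks only.
example_index = interaction_index(-4.0, -2.0, -3.0, -3.0)

# Toy vectors that are perfectly negatively related.
example_r = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

With per-participant arrays as input, `interaction_index` returns one value per participant, which can then be passed to `pearson_r` together with the ADOS-2 total scores.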
Following up on this marginally significant correlation between the ADOS-2 score and the interaction effect, we also zoomed in on the correlations with the effect of initial role in the slow condition (i.e., the primacy bias), as this is where we previously found group differences. There was a marginally significant correlation (r = –0.33, p = 0.09, see Fig. 5B), hinting at a decreasing primacy bias with increasing autism symptom severity. When IQ was controlled for, this p value changed by no more than 0.03. There was no correlation with the effect of initial role in the fast condition (r = –0.09, p = 0.67).
In summary, these correlational results hint at a relation between autism symptomatology and a reduced primacy bias. However, this result should be interpreted with caution, as the relation was only marginally significant and we did not correct for multiple comparisons.
Model simulations
To simulate the observed patterns in MMN in the first sequence, we also ran a reinforcement learning model consisting of two pairs of weights: one with a regular and one with a decaying learning rate for each sound. The primacy bias parameter p determined the degree to which the model was affected by the regular learning rate versus decaying learning rate, with higher values corresponding to less influence of the decaying learning rate. This parameter p was set to 0.75 in the TD group and 0.95 in the autism group, leading to a reduced primacy bias in the autism group. All model details can be found in Materials and Methods.
To simulate the MMN, we focused on the prediction errors experienced on each trial. MMN was defined as the absolute difference between prediction errors on deviant trials and prediction errors on standard trials. Figure 6D shows the simulated MMN averaged over blocks, which closely matches the pattern in Figure 6C, which shows the MMN from the collected EEG data. This indicates that a simple RL model with a regular and decaying learning rate can simulate the MMN responses in the TD group, and that a reduced primacy bias in the autism group can explain the group differences in MMN.
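The simulated MMN measure can be illustrated with a short sketch that takes trial-wise prediction errors and returns one value per block. Two choices here are assumptions: prediction errors are first averaged within deviant and standard trials of a block, and the absolute difference of those two means is then taken. The function name and the toy values are hypothetical.

```python
import numpy as np


def simulated_mmn(pred_error, is_deviant, block):
    """Per block: |mean prediction error on deviants - mean on standards|."""
    pred_error = np.asarray(pred_error, dtype=float)
    is_deviant = np.asarray(is_deviant, dtype=bool)
    block = np.asarray(block)
    mmn = {}
    for b in np.unique(block):
        in_block = block == b
        deviant_mean = pred_error[in_block & is_deviant].mean()
        standard_mean = pred_error[in_block & ~is_deviant].mean()
        mmn[int(b)] = abs(deviant_mean - standard_mean)
    return mmn


# Toy input: two blocks with hand-picked prediction errors.
toy_mmn = simulated_mmn(
    pred_error=[-0.5, -0.5, 0.2, 0.2, -1.0, 0.0],
    is_deviant=[True, True, False, False, True, False],
    block=[0, 0, 0, 0, 1, 1],
)
```

Averaging the returned values over the slow or fast blocks, separately for blocks in which the initial deviant or the initial standard served as the deviant, would give condition means comparable in structure to those plotted for the EEG data.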
Discussion
Recent theories propose that autism characteristics can be understood from a predictive coding framework (Pellicano and Burr, 2012; Lawson et al., 2014; Van de Cruys et al., 2014; for a review see Palmer et al., 2017). They put forward that a core deficit in autism might be a high, context-insensitive weighting of prediction errors, which would lead to a faster incorporation of new information into modeled predictions in autism, compared with TD participants. This can naturally explain common symptoms of autism, such as hypersensitivity to sensory input, insistence on sameness, and repetitive behavior (as a way to create predictability and reduce the amount of prediction errors that signal the model needs to be updated). On the other hand, slower model updating in autism has also been proposed (e.g. I. Lieder et al., 2019). In the current study, we set out to differentiate between these two hypotheses by investigating how models of the acoustical context are influenced by first impressions. We used the multi-timescale MMN paradigm by Todd et al. (2013), which has consistently shown that the initial model (i.e., which tone was the standard sound, and which the deviant, in a first series of trials) influences perception (reflected in MMN amplitudes) in later series of sounds, termed a primacy bias. Consistent with predictive coding theories, we expected that individuals with autism would be faster in updating models of sensory information and that this would be apparent as a smaller primacy bias (i.e., a reduced modulation of MMN amplitudes by the initial model). However, if autism is characterized by slower model updating, this should result in a larger primacy bias in MMN amplitudes.
Our results showed that, in the slow blocks presented at the beginning of the experiment, the TD group showed larger MMN amplitudes to initial deviant tones compared with initial standard tones when they were later presented as a deviant (consistent with earlier studies using a very similar design, Todd et al., 2011, 2014). This presumably indicates that TD participants confidently recognized the initial acoustic model when it re-emerged, resulting in more distinct responses for the deviant and standard tone and thus a larger MMN amplitude (which is defined as the difference between the responses to the deviant and standard tone). Crucially, this primacy bias was completely absent in the autism group, where we found no difference between the initial deviant and initial standard tone in the slow condition. This suggests that individuals with autism are not influenced by first impressions to the same extent as TD individuals, and can thus be interpreted as an indication of faster model updating (or abandoning of older information) in autism, consistent with predictive coding theories of autism.
In the subsequent fast blocks, there still seemed to be a difference between the initial deviant and initial standard for both groups. However, while we cannot fully exclude the idea that a primacy bias suddenly and surprisingly appears in the autism group there, we believe that this effect is likely because of increased attention after the break (leading to an increased MMN in the very first fast block), as well as the unexpected fast reversal in contingencies in the second block (presumably leading to a reduced MMN). Supporting this interpretation, the effect disappeared after these first two fast blocks for both groups, and was also absent in the second sequence. This is in line with earlier findings that show the primacy bias disappears over time in a third sequence (Todd et al., 2013). This is presumably because participants learn about the sequence structure over the course of the experiment; that is, they learn that initial deviants switch over time and thus that they should not assign a high weight to the initial deviant when listening to the sequences. That the primacy bias effect already disappears by the second sequence here is somewhat surprising but is likely because of a small methodological difference. In previous studies, deviants could not occur consecutively within a block, but this was not prevented in the current study, leading to a small number of consecutive deviants within blocks. This could result in a slightly lower level of distinction between the two tone roles. This might have caused the participants to learn earlier not to assign high importance to the initial deviant when listening to the sequence, and might thus explain the earlier disappearance of the primacy bias, compared with previous studies. It would be interesting for future studies to investigate whether our finding of a reduced primacy bias in autism can be replicated in an experiment in which fast blocks are presented first in a sequence, and slow blocks last.
Our interpretation of a reduced primacy bias in the first sequence in the autism group was further supported by model simulations. We used a reinforcement learning model with both a regular learning rate and a learning rate that decayed over time. A primacy bias parameter determined how much the model was influenced by each of these two learning rates. Simulations showed that this model could accurately reproduce the MMN pattern in the TD group. Crucially, by changing only the primacy bias parameter, the otherwise identical model accurately simulated the MMN pattern in the autism group. These simulations support our conclusion that a reduced influence of initial information can explain the MMN pattern in the autism group.
Finally, interindividual difference analyses hinted at a correlation between our effects of interest and autism symptomatology in the autism group. While we have to interpret these effects with caution, this does support the idea that faster model updating is related to autistic characteristics. Specifically, we observed marginally significant correlations with the ADOS-2 total scores only, and not with the questionnaires AQ or SRS-A. A possible explanation for this discrepancy might lie in the fact that the AQ and SRS-A are self-report questionnaires whereas ADOS-2 is an observational instrument.
Here, we focused on the influence of initial information on the course of MMN amplitudes in autism. In contrast, earlier studies compared overall MMN amplitudes between autistic and control participants, which resulted in mixed findings of reduced amplitudes (Gomot et al., 2002; Dunn et al., 2008; Kujala et al., 2010; Andersson et al., 2013; Ludlow et al., 2014; Abdeltawwab and Baz, 2015; Vlaskamp et al., 2017; Ruiz-Martínez et al., 2020; for a meta-analysis, see Schwartz et al., 2018; Chen et al., 2020), increased amplitudes (Ferri et al., 2003; Lepistö et al., 2005, 2006, 2008; Grisoni et al., 2019), and no differences in amplitude (Kujala et al., 2007a; Chien et al., 2018; Goris et al., 2018; Hudac et al., 2018). While the current study is in line with the latter finding, we want to emphasize that the crucial difference between groups was a primacy bias in MMN amplitudes in the TD group which was absent in the autism group.
Together, these results extend previous findings of reduced context sensitivity of prediction errors in autism (Palmer et al., 2015; Skewes et al., 2015; Lawson et al., 2017; Goris et al., 2018) by supporting the hypothesis that the weighting of sensory prediction errors in autism is not only context-insensitive but also generally higher (Lawson et al., 2014; Van de Cruys et al., 2014). In other words, our findings confirm the idea that new information is incorporated faster in sensory models in autism, and older information is abandoned faster, and demonstrate this at a very early level of sensory processing. Importantly, this weaker primacy bias found in sensory perception might not generalize to higher levels of information processing. On the contrary, some studies have suggested a stronger primacy bias in autism during decision-making (South et al., 2012; D'Cruz et al., 2013; I. Lieder et al., 2019; Goris et al., 2021). Possibly, people with autism overcompensate for faster model updating during early sensory processing by being more conservative at higher levels of information processing. Future studies should systematically compare this seemingly divergent pattern of aberrant weighting of prediction errors during early sensory perception versus behavior at higher levels of information processing (for example, by investigating different types of primacy biases; e.g., Kotchoubey, 2014; Bronfman et al., 2016; Oberfeld et al., 2018).
In conclusion, we found that the MMN, an EEG component reflecting early sensory prediction errors, was less modulated by initial models in a group of adults with autism, compared with a control group, showing a weaker primacy bias in sensory perception in autism. This interpretation was further supported with model simulations, which predicted the MMN patterns in both groups and the difference between them, by only changing the extent to which the model is influenced by initial information. These findings confirm the hypothesis of faster model updating, as put forward by predictive coding accounts of autism.
Footnotes
J.G. was supported by FWO–Research Foundation Flanders PhD fellowship. E.D. was supported by FWO–Research Foundation Flanders postdoctoral fellowship. S.B. was supported by European Union's Horizon 2020 research and innovation program ERC Starting grant (Grant agreement 852570). M.B. was supported by Einstein Strategic Professorship (Einstein Foundation Berlin) and Deutsche Forschungsgemeinschaft, Germany's Excellence Strategy (EXC 2002/1 “Science of Intelligence,” Project 390523135).
The authors declare no competing financial interests.
Correspondence should be addressed to Judith Goris at judith.goris@ugent.be