Abstract
The ability to establish associations between visual objects and speech sounds is essential for human reading. Understanding the neural adjustments required for acquisition of these arbitrary audiovisual associations can shed light on fundamental reading mechanisms and help reveal how literacy builds on pre-existing brain circuits. To address these questions, the present longitudinal and cross-sectional MEG studies characterize the temporal and spatial neural correlates of audiovisual syllable congruency in children (age range, 4–9 years; 22 males and 20 females) learning to read. Both studies showed that during the first years of reading instruction children gradually set up audiovisual correspondences between letters and speech sounds, which can be detected within the first 400 ms of a bimodal presentation and recruit the superior portions of the left temporal cortex. These findings suggest that children progressively change the way they treat audiovisual syllables as a function of their reading experience. This reading-specific brain plasticity implies (partial) recruitment of pre-existing brain circuits for audiovisual analysis.
SIGNIFICANCE STATEMENT Linking visual and auditory linguistic representations is the basis for the development of efficient reading, while dysfunctional audiovisual letter processing predicts future reading disorders. Our developmental MEG project included a longitudinal and a cross-sectional study; both studies showed that children's audiovisual brain circuits progressively change as a function of reading experience. They also revealed an exceptional degree of neuroplasticity in audiovisual neural networks, showing that as children develop literacy, the brain progressively adapts so as to better detect new correspondences between letters and speech sounds.
Introduction
Literacy is a relatively recent cognitive achievement in human evolution for which there are no specialized neural circuits already in place. Learning this life-changing skill thus requires considerable modulation of pre-existing brain networks, such as the visual object recognition and spoken language networks (Carreiras et al., 2009; Dehaene et al., 2015). A considerable amount of research on reading-related brain changes has examined this plasticity in either visual or auditory brain circuits (Goswami and Ziegler, 2006; Ziegler and Muneaux, 2007; Dehaene et al., 2010, 2015). However, the core of reading acquisition lies in the interaction between these two modalities. Efficient reading skills crucially depend on the ability to compare and connect visual and auditory representations of letters (Blomert, 2011). The present MEG study focused on these audiovisual processes, testing how they changed as a function of developing reading abilities. We hypothesized that during reading acquisition pre-existing brain circuits for audiovisual processing should become progressively tuned to the arbitrary relationships between letters and speech sounds (Blomert, 2011).
The processing of natural audiovisual associations (e.g., the correspondence between speech and lip movements) has been widely explored in the literature. The effects of audiovisual integration (i.e., the absolute difference between bimodal and unimodal presentations) and audiovisual congruency (i.e., the absolute difference between matching and mismatching bimodal presentations) have mainly been localized in the auditory cortex and the superior temporal cortex (Amedi et al., 2005; Hocking and Price, 2008), with possible left lateralization (Calvert et al., 1998; Calvert, 2001). Research on fluent adult readers has shown that these brain areas seem to be (at least partially) recruited even when processing arbitrary associations between letters and speech sounds, indicating a certain degree of plasticity in audiovisual brain areas during reading acquisition (Amedi et al., 2005; Hocking and Price, 2008; Blomert and Froyen, 2010). Neuroimaging studies comparing matching and mismatching letter–sound pairs reported effects in the superior temporal and auditory cortex (van Atteveldt et al., 2004, 2007; Blau et al., 2008, 2010; Karipidis et al., 2017, 2018), which were often left lateralized and appeared within the first 500 ms of stimulus presentation (Karipidis et al., 2017, 2018; Raij et al., 2000; Xu et al., 2019, 2020; for even earlier effects, see Herdman et al., 2006). Importantly, cross-sectional designs have revealed a relation between these audiovisual effects and reading skills (Blau et al., 2010; Jost et al., 2014; Karipidis et al., 2017, 2018), indicating that cross-modal brain responses are affected by literacy experience. Studies on normal reading acquisition in children suggest that automatic effects of audiovisual letter processing are rare in beginning readers (Xu et al., 2018) and may emerge only after a few years of formal reading instruction under facilitated experimental conditions (e.g., nonsimultaneous bimodal presentations; Froyen et al., 2009). However, the few studies of these plastic brain changes during development have so far relied exclusively on between-group comparisons. Longitudinal designs overcome a potential limitation of between-group designs, namely the difficulty of establishing perfectly matched groups. The present MEG study is the first to adopt a longitudinal (alongside a cross-sectional) design to characterize the progressive emergence of audiovisual congruency effects as children learn to read. Matching and mismatching audiovisual syllables were presented to children. We predicted that the audiovisual congruency effect should be localized in the left superior temporal cortex and left auditory cortex and emerge within 500 ms after stimulus onset. We expected this effect to be reading specific and, thus, to correlate with children's reading scores.
Materials and Methods
Participants
Forty-two Basque-Spanish early bilingual children participated in the cross-sectional study (20 females; mean age, 6.3 years; SD, 1.7; age range, 4–9 years). Data from five additional participants were excluded because of poor data quality (n = 4) or the presence of a hearing disorder (n = 1). Participants were divided into two groups (prereaders and readers) based on whether they had already received formal reading instruction (Table 1). Fifteen children from the prereader group also participated in the longitudinal study, returning for a second MEG recording session. The mean time between session 1 and session 2 was 32 months (SD, 5; age range, 4–8 years; Table 1).
Behavioral description of participants in the cross-sectional and the longitudinal studies
All participants were learning to read in Basque. Basque has a transparent orthography, such that the consistent correspondences between letters and speech sounds are usually mastered within 1 year of reading instruction. Readers' school attendance was regular, and none of them were repeating or had skipped a grade. All participants had normal or corrected-to-normal vision and normal hearing. Their parents reported no neurologic disorders and did not suspect developmental reading problems. The BCBL (Basque Center on Cognition, Brain and Language) ethics committee approved the experiment (following the principles of the Declaration of Helsinki), and all parents (or legal guardians) of the children completed and signed a written informed consent form.
Materials and procedure
Thirty consonant–vowel syllables were created using one of six consonants (f, k, l, m, p, t) followed by one of five vowels (a, e, i, o, u) from the Basque alphabet. We used syllables rather than single letters to make the stimuli more ecologically valid: Basque children learn to name Basque letters using syllables, and the consonant–vowel syllable structure is highly common in the Basque lexicon. We did not expect this choice to affect our results, as audiovisual congruency effects have been reported for a wide range of linguistic (e.g., letters, words, ideograms; Amedi et al., 2005; Hocking and Price, 2008; Xu et al., 2019) and nonlinguistic (pictures; Hocking and Price, 2008) stimuli. The syllables were presented four times in both the visual and the auditory modality to create 120 cross-modal pairs. Spoken syllables were recorded from a female speaker at a 44.1 kHz sampling rate. The audiovisual correspondence of cross-modal pairs was manipulated to produce 60 matching and 60 mismatching pairs. The mismatching pairs were pseudorandomly selected so that they always differed in the initial consonant while sharing the final vowel. Sixteen cross-modal syllable pairs were added for a target detection task; they contained the image of a cat in between the letters in the visual presentation and/or the sound of a cat meowing in between the letter sounds in the auditory presentation.
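As an illustration of the pairing logic only (this is not the authors' stimulus script; variable names and the random selection are ours), a minimal MATLAB sketch could look as follows:

```matlab
% Illustrative sketch: build the 30 consonant-vowel syllables and derive
% matching and mismatching audiovisual pairs that share the final vowel
% but differ in the initial consonant.
consonants = {'f','k','l','m','p','t'};
vowels     = {'a','e','i','o','u'};
syllables  = cell(1, numel(consonants)*numel(vowels));
idx = 0;
for c = 1:numel(consonants)
    for v = 1:numel(vowels)
        idx = idx + 1;
        syllables{idx} = [consonants{c} vowels{v}];
    end
end

firstChar = cellfun(@(s) s(1), syllables);   % initial consonant of each syllable
lastChar  = cellfun(@(s) s(end), syllables); % final vowel of each syllable

matchPairs    = [syllables' syllables'];     % written and spoken syllable identical
mismatchPairs = cell(numel(syllables), 2);
for s = 1:numel(syllables)
    % candidates: same final vowel, different initial consonant
    candidates = find(lastChar == lastChar(s) & firstChar ~= firstChar(s));
    pick = candidates(randi(numel(candidates)));  % random here; pseudorandom in the study
    mismatchPairs(s,:) = [syllables(s) syllables(pick)];
end
```

Repeating each pairing would then yield the 60 matching and 60 mismatching trials described above.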
On each experimental trial, the visual stimulus (written syllable) was first presented at the center of the screen. After 1 s, the auditory stimulus (spoken syllable) was also presented, while the written syllable remained displayed on the screen. The visual and auditory stimulus offsets coincided, and the interstimulus interval was 1000 ms (Fig. 1). The onsets of the visual and auditory syllable presentations were shifted to create a facilitated experimental situation in which early audiovisual congruency effects were more likely to be observed (Froyen et al., 2009). Moreover, this temporal sequence better reflected children's everyday experience, such as listening to stories read aloud, where they hear language after seeing it in print. Auditory stimuli were presented at 70–80 dB through plastic tubes and silicon earpieces (mean duration, 700 ms; SD, 95 ms). The task consisted of pressing a button whenever the current stimulus contained a cat, either in the visual or in the auditory modality. Stimulus order was randomized across participants. The recording session lasted ∼10 min.
Schematic representation of an experimental trial.
MEG data recording and preprocessing
MEG data were recorded in a magnetically shielded room (MaxShield, Elekta) using an Elekta Neuromag MEG device (102 sensor triplets, each comprising two planar gradiometers and one magnetometer). MEG recordings were acquired continuously with children in a sitting position, with a bandpass filter at 0.03−330 Hz and a sampling rate of 1 kHz. Head position inside the helmet was continuously monitored using five head position indicator coils. The location of each coil relative to the anatomic fiducials (nasion, and left and right preauricular points) was defined with a 3D digitizer (Fastrak, Polhemus). This procedure is critical for head movement compensation during the data recording session. In addition, ∼200 head surface points were digitized and later used to spatially align the MEG sensors with an age-based pediatric T1 template (Fonov et al., 2011).
Eye movements were monitored with bipolar vertical electro-oculogram (VEOG) and horizontal electro-oculogram (HEOG) electrodes. MEG data were individually corrected for head movements and subjected to noise reduction using MaxFilter (version 2.2.15; Neuromag, Elekta) and the temporally extended signal space separation method (Taulu and Kajola, 2005; Taulu and Hari, 2009). On average, 10 bad channels were automatically identified using Xscan (Neuromag, Elekta). Bad channels were substituted with interpolated values. The number of interpolated channels did not differ between readers (10.2; SD, 2.2) and prereaders (9.1; SD, 2.2; t < 1), or between session 1 (10.1; SD, 2.3) and session 2 (10.2; SD, 3.3; t < 1).
Subsequent analyses were performed using MATLAB R2014 (MathWorks) and the Fieldtrip toolbox (Oostenveld et al., 2011). MEG epochs of 2.5 s were obtained, including 1.5 s before and 1.0 s after the auditory presentation onset. High-frequency muscle artifacts (110–140 Hz) were automatically rejected: average z-values over sensors and time points in each trial were calculated, and trials exceeding the threshold of a z score equal to 30 were removed. To suppress eye movement artifacts, 70 independent components were identified by applying independent component analysis (Jung et al., 2000) to the MEG data. Independent components corresponding to ocular artifacts were identified and removed based on the correlation values between each component and the VEOG/HEOG channels (rejected components range, 0–2).
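For readers who wish to implement a comparable pipeline, the following Fieldtrip sketch illustrates the epoching, muscle-artifact rejection, and ICA steps described above. The function calls are standard Fieldtrip functions; the file name, trial definition (trl), and rejected component indices (badComponents) are placeholders, not the authors' actual settings.

```matlab
% Hedged sketch of the epoching and artifact-rejection steps (Fieldtrip).
cfg = [];
cfg.dataset = 'child_session_tsss.fif';    % hypothetical MaxFiltered file name
cfg.trl     = trl;                         % assumed: 2.5 s epochs around the sound onset
cfg.channel = 'MEG';
data = ft_preprocessing(cfg);

% Automatic rejection of high-frequency muscle artifacts (110-140 Hz, z > 30).
cfg = [];
cfg.artfctdef.zvalue.channel    = 'MEG';
cfg.artfctdef.zvalue.cutoff     = 30;
cfg.artfctdef.zvalue.bpfilter   = 'yes';
cfg.artfctdef.zvalue.bpfreq     = [110 140];
cfg.artfctdef.zvalue.bpfiltord  = 9;
cfg.artfctdef.zvalue.bpfilttype = 'but';
cfg.artfctdef.zvalue.hilbert    = 'yes';
cfg.artfctdef.zvalue.trlpadding = 0;
cfg.artfctdef.zvalue.fltpadding = 0;
cfg.artfctdef.zvalue.artpadding = 0;
[cfg, artifact_muscle] = ft_artifact_zvalue(cfg, data);
cfg.artfctdef.reject = 'complete';         % drop whole trials containing artifacts
data_clean = ft_rejectartifact(cfg, data);

% ICA with 70 components; components correlated with the VEOG/HEOG channels
% are identified (not shown here) and removed.
cfg = [];
cfg.method       = 'runica';
cfg.numcomponent = 70;
comp = ft_componentanalysis(cfg, data_clean);

cfg = [];
cfg.component = badComponents;             % assumed: 0-2 ocular components per child
data_ica = ft_rejectcomponent(cfg, comp, data_clean);
```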
Finally, MEG epochs were visually inspected to discard any remaining artifacts. On average, 28.1% (SD, 13.1) of trials were rejected [cross-sectional study: 26.7% (SD, 11.6); longitudinal study: 30.0% (SD, 14.9)], with no significant difference between conditions (F values < 3; p values > 0.05) or groups (F values < 5; p values > 0.05).
MEG experimental design and statistical analysis
Sensor-level event-related fields.
The artifact-free MEG data were low-pass filtered at 35 Hz. Trials were grouped by condition and averaged to obtain the event-related fields (ERFs). ERFs were quantified as the absolute amplitude of the 102 orthogonal planar gradiometer pairs by computing the square root of the sum of squares of the amplitudes of the two gradiometers in each pair. A baseline correction was applied using the 500 ms interval preceding stimulus onset.
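One possible Fieldtrip implementation of these steps is sketched below (variable names build on the preprocessing sketch above and are assumed; the baseline window is illustrated relative to the auditory onset, which is an assumption on our part).

```matlab
% Minimal ERF sketch: 35 Hz low-pass, per-condition averaging, planar
% combination, and a 500 ms pre-stimulus baseline correction.
cfg = [];
cfg.lpfilter = 'yes';
cfg.lpfreq   = 35;
data_lp = ft_preprocessing(cfg, data_ica);

cfg = [];
cfg.trials = find(data_lp.trialinfo == 1);   % hypothetical code for the match condition
erf_match  = ft_timelockanalysis(cfg, data_lp);

% Combine each orthogonal planar gradiometer pair (root-sum-of-squares amplitude).
cfg = [];
erf_match_cmb = ft_combineplanar(cfg, erf_match);

cfg = [];
cfg.baseline  = [-0.5 0];                    % assumed: 500 ms preceding the auditory onset
erf_match_cmb = ft_timelockbaseline(cfg, erf_match_cmb);
```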
In both the cross-sectional and longitudinal studies, the ERFs for the match and mismatch conditions of prereaders and readers were statistically compared using a nonparametric cluster-based permutation test (Maris and Oostenveld, 2007). Specifically, t statistics were computed for each sensor (combined gradiometers) and time point during the 0–1000 ms time window, and a clustering algorithm formed groups of adjacent channels and time points based on these statistics. The neighborhood definition was based on the template for combined gradiometers of the Neuromag-306 system provided by the toolbox. For a data point to become part of a cluster, a threshold of p = 0.05 was used (based on a two-tailed dependent-samples t test, with the tail probability corrected for two-sided testing). The sum of the t statistics within a cluster was then used as the cluster-level statistic (the maxsum option in Fieldtrip), which was evaluated with a randomization test using 1000 permutations. Moreover, we used a two-tailed t test to perform a between-group comparison of the audiovisual congruency effects (ERF differences between mismatch and match conditions) in the cross-sectional and the longitudinal study. Finally, partial correlations were calculated to evaluate the relationship between the magnitude of the audiovisual congruency effect and reading performance after correcting for age, vocabulary size, and nonverbal intelligence.
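The sketch below shows how such a match-versus-mismatch contrast is typically configured in Fieldtrip. The function names and options are real Fieldtrip calls matching the parameters reported above; the per-subject ERF variables (allMatch, allMismatch) are assumed, and this is a sketch rather than the authors' exact script.

```matlab
% Cluster-based permutation contrast, match vs mismatch (dependent samples).
cfg_nb          = [];
cfg_nb.method   = 'template';
cfg_nb.template = 'neuromag306cmb_neighb.mat';   % combined-gradiometer neighbours
neighbours = ft_prepare_neighbours(cfg_nb, erf_match_cmb);

nSubj = numel(allMatch);                         % assumed: cell arrays of subject ERFs
cfg = [];
cfg.method           = 'montecarlo';
cfg.statistic        = 'ft_statfun_depsamplesT'; % two-tailed dependent-samples t test
cfg.correctm         = 'cluster';
cfg.clusteralpha     = 0.05;                     % threshold for cluster membership
cfg.clusterstatistic = 'maxsum';                 % sum of t values within a cluster
cfg.tail             = 0;
cfg.correcttail      = 'prob';                   % probability correction for two tails
cfg.numrandomization = 1000;
cfg.latency          = [0 1];                    % 0-1000 ms after the auditory onset
cfg.neighbours       = neighbours;
cfg.design = [1:nSubj 1:nSubj; ones(1,nSubj) 2*ones(1,nSubj)];
cfg.uvar   = 1;                                  % row coding the subject
cfg.ivar   = 2;                                  % row coding the condition
stat = ft_timelockstatistics(cfg, allMatch{:}, allMismatch{:});
```

The between-group comparison of congruency effects would use an independent-samples statistic instead, with the same clustering settings.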
Source-level ERFs.
Using MRiLab (version 1.7.25; Neuromag, Elekta), the digitized points from the Fastrak digitizer (Polhemus) were coregistered to the skin surface obtained from an age-compatible T1 template (Fonov et al., 2011). The T1 template was segmented into scalp, skull, and brain components using the segmentation algorithms implemented in Freesurfer (Dale et al., 1999). The source space was defined as a regular 3D grid with a 5 mm resolution, and the lead fields were computed using a realistic three-shell model. Both planar gradiometers and magnetometers were used for inverse modeling. Whole-brain source activity was estimated using the linearly constrained minimum variance (LCMV) beamformer approach (Van Veen et al., 1997). For each condition, the LCMV beamformer was computed on the evoked data in the −400 to 0 ms prestimulus and in the 350–750 ms poststimulus time intervals. This poststimulus interval was chosen because it contained the audiovisual congruency effects at the sensor level. Statistical significance was assessed with a paired t test (as implemented in SPM) comparing mean source amplitudes in the poststimulus and prestimulus intervals.
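A hedged Fieldtrip sketch of this source-level step is shown below; the grid (sourcemodel) and three-shell volume conductor (headmodel) derived from the pediatric template are assumed to be available, and the time windows follow the text.

```matlab
% LCMV beamformer on the evoked data (sketch; variable names are assumed).
cfg = [];
cfg.covariance       = 'yes';
cfg.covariancewindow = [-0.4 0.75];    % spans the pre- and post-stimulus windows
tlck = ft_timelockanalysis(cfg, data_lp);

cfg = [];
cfg.method          = 'lcmv';
cfg.sourcemodel     = sourcemodel;     % assumed: regular 3D grid, 5 mm resolution
cfg.headmodel       = headmodel;       % assumed: realistic three-shell model
cfg.lcmv.keepfilter = 'yes';
cfg.lcmv.fixedori   = 'yes';
source = ft_sourceanalysis(cfg, tlck);
% The resulting spatial filters can then be applied to the -400 to 0 ms
% pre-stimulus and 350-750 ms post-stimulus averages of each condition, and the
% two amplitude estimates compared voxel-wise with a paired t test (as in SPM).
```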
Results
Participants were able to correctly identify the target stimuli (cross-sectional d′, 1.870; longitudinal d′, 2.077), with no differences across groups (cross-sectional: t(40) = 1.114, p = 0.272; longitudinal: t(14) = 1.872, p = 0.082).
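For reference, d′ is the difference between the z-transformed hit and false-alarm rates. The sketch below uses hypothetical counts for a single child, together with the target and nontarget trial numbers from the design described above.

```matlab
% Hypothetical d' computation (norminv requires the Statistics Toolbox).
nTargets    = 16;                 % cat trials added to the 120 experimental pairs
nNonTargets = 120;
hits        = 14;                 % hypothetical counts for one child
falseAlarms = 2;
hitRate = hits / nTargets;
faRate  = falseAlarms / nNonTargets;
% Rates of exactly 0 or 1 are usually nudged (e.g., by 1/(2N)) before the z transform.
dprime  = norminv(hitRate) - norminv(faRate);
```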
For the cross-sectional study (Fig. 2A), cluster-based permutation tests on the ERF responses showed an audiovisual congruency effect (p = 0.001; difference between mismatch and match conditions) only for readers, in a 350–790 ms time window following the auditory syllable onset over left temporal sensors (Fig. 2A). The magnitude of the audiovisual congruency effect differed between readers and prereaders (p = 0.005). This difference was driven by the suppressed amplitude of the match condition in readers compared with prereaders (match condition, p = 0.021; mismatch condition, p = 0.105; Fig. 2B,C).
ERFs for the cross-sectional study. A, Grand average ERF responses to spoken syllables for the match (blue) and the mismatch (red) condition in prereaders (left) and readers (right). Shaded edges represent ±1 SE. ERF waveform averages were calculated based on the group of left sensors displayed on the map in the top left corner. The top maps represent the topographic distribution of the audiovisual congruency effect (calculated by subtracting the match from the mismatch condition) within the time window when the effect reached its maximum. The topographic maps at the bottom show the spatial distribution of the statistically significant cluster in the same time window (yellow color scale indexes the magnitude of t values that passed the statistical threshold of 0.05). B, Topographic maps of the difference between readers and prereaders. C, Spatial distribution of the statistically significant cluster when comparing readers and prereaders (yellow color scale indexes the magnitude of significant t values).
Similarly, for the longitudinal study (Fig. 3A), we observed an audiovisual congruency effect (p = 0.017) in a 390–563 ms time window following the auditory syllable onset over left temporal sensors (Fig. 3A). The magnitude of the audiovisual congruency effect differed between sessions (session 1 vs session 2, p = 0.038). Again, this difference was driven by the suppressed amplitude of the match condition in the readers (session 2) compared with the prereaders (session 1; match condition, p = 0.021; mismatch condition, p = 0.627; Fig. 3B,C).
ERFs for the longitudinal study. A, Grand average ERF responses to spoken syllables for the match (blue) and the mismatch (red) conditions in session 1 and session 2. Shaded edges represent ±1 SE. ERF waveform averages were calculated based on the group of left sensors displayed on the map in the top left corner. The top maps represent the topographic distribution of the audiovisual congruency effect (calculated by subtracting the match from the mismatch condition) within the time window when the effect reached its maximum. The topographic maps at the bottom show the spatial distribution of the statistically significant cluster in the same time window (yellow color scale indexes the magnitude of t values that passed the statistical threshold of 0.05). B, Topographic maps of the difference between session 1 and session 2. C, Spatial distribution of the statistically significant cluster when comparing session 1 and session 2 (yellow color scale indexes the magnitude of significant t values).
The ERF effects observed at the sensor level were source reconstructed in the 350–750 ms time window. In both the cross-sectional and longitudinal studies, the congruency effect (p < 0.05) emerged in the posterior part of the left superior temporal cortex (Fig. 4).
Spatial localization of the audiovisual congruency effect for readers in the cross-sectional and the longitudinal study. The final plot shows the conjunction of the two effects. Paired t tests comparing the mean source activity in the prestimulus and poststimulus intervals were calculated. Color intensity indexes the magnitude of significant t values.
The size of the audiovisual congruency effect negatively correlated with reading errors and reading speed measures after correcting for age, nonverbal intelligence, and vocabulary size (syllable reading times: r = −0.31, p = 0.031; number of errors per second while reading Basque words: r = −0.36, p = 0.014; number of errors per second while reading Basque pseudowords: r = −0.23, p = 0.090; Fig. 5).
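These partial correlations can be computed with MATLAB's partialcorr function; in the sketch below, the per-child vectors for the congruency effect (AVCE), the reading measure, and the covariates are assumed variable names, not the authors' code.

```matlab
% Partial correlation between the congruency effect and one reading measure,
% controlling for age, nonverbal intelligence, and vocabulary size (sketch).
Z = [age nonverbalIQ vocabulary];                      % assumed covariate columns
[r, p] = partialcorr(AVCE, readingTimes, Z, 'Rows', 'complete');
```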
Correlation between the residuals of the audiovisual congruency effect (AVCE) and the residuals of reading scores (after correction for age, nonverbal intelligence, and vocabulary size). From left to right: syllable reading times, number of errors per second while reading Basque words, number of errors per second while reading Basque pseudowords. All readers are displayed in the scatterplots (n = 37; dark gray, cross-sectional study; light gray, longitudinal study).
Discussion
The capacity to create strong associations between speech sounds and written representations is a key skill for reading. Audiovisual letter and audiovisual symbol processing predict future reading fluency (Horbach et al., 2015, 2018; Karipidis et al., 2018) and are often impaired in dyslexia (Fox, 1994; Vellutino et al., 2004; Froyen et al., 2011; Richlan, 2019). Understanding the developmental changes involved in letter-to-speech sound processing can shed light on the pivotal mechanisms of reading and can point to possible sources of reading disorders. With this aim, the present study investigated how audiovisual syllable analysis changed as a function of reading acquisition. The results showed a high degree of plasticity in neural responses to audiovisual syllable congruency, which was related to reading acquisition (as shown by partial correlations with reading performance). This neural adjustment was mainly localized in the left superior temporal cortex, which is in line with previous findings (Raij et al., 2000; Blau et al., 2008, 2010; Karipidis et al., 2017, 2018; Xu et al., 2019, 2020). Importantly, this brain area is not exclusively involved in the processing of letter–speech sound correspondences, but is also sensitive to less arbitrary audiovisual associations available before reading acquisition (Calvert et al., 1998; Calvert, 2001; Amedi et al., 2005). This broad sensitivity is compatible with the idea that we do not have evolutionarily specialized circuits for reading, and literacy must build on pre-existing brain networks (Dehaene et al., 2010, 2015). In line with this hypothesis, previous findings have shown reading-related adjustment of naturally evolved brain mechanisms for visual and auditory processing (Goswami and Ziegler, 2006; Ziegler and Muneaux, 2007; Dehaene et al., 2010, 2015). The present findings extend this claim, suggesting that reading experience can also have an impact on naturally evolved brain mechanisms for audiovisual processing (Blomert, 2011).
The direction of the audiovisual congruency effect is also informative. Past research reveals considerable inconsistency: some studies have shown stronger responses for matching conditions; others report the opposite pattern (Table 2). Although it remains unclear what drives the direction of the effect (for some proposals, see Hocking and Price, 2008; Holloway et al., 2015; Plewko et al., 2018; Wang et al., 2020), we note that ∼70% of the studies reporting stronger matching responses are fMRI studies. The reverse pattern has been more frequently observed in electrophysiology and with experimental designs that include nonsimultaneous audiovisual presentations. This could indicate that temporal aspects of experimental design may affect the direction of the effect. The present MEG studies fully align with these trends found in the literature.
Quick summary of the direction of audiovisual congruency effects previously reported in the literature
In both the longitudinal and cross-sectional studies, we observed progressive suppression of the audiovisual matching response as a function of reading skills. Given that the congruency effect was found in auditory areas and that the mismatch condition showed no modulation, it is unlikely that attentional mechanisms account for this effect. This pattern is more likely the result of cross-modal integration, since audiovisual correspondences can only be detected through successful interaction between the two unimodal inputs. However, not all brain areas showing a congruency effect are necessarily the source of integrative operations (van Atteveldt et al., 2004, 2007; van Atteveldt and Ansari, 2014). Neuroimaging studies on adults comparing unimodal and bimodal letters have proposed a finer functional distinction within subareas of the left superior temporal cortex. According to this view, the superior temporal sulcus is the neural hub for audiovisual convergence and integration, which sends feedback to superior auditory areas signaling letter–sound congruency (van Atteveldt et al., 2004, 2007). This functional distinction is further supported by cytoarchitectonic studies in human and nonhuman primates, which have shown differences in the cellular structure of dorsolateral and ventromedial temporal regions (Ding et al., 2009; Insausti, 2013; Zachlod et al., 2020). The reduced response of the superior temporal cortex to matching audiovisual syllables might reflect the sharpening of neuronal tuning (i.e., responses to overlearned audiovisual associations are suppressed; Hurlbert, 2000), cross-modal repetition suppression (Henson, 2003), or neural adaptation (Grill-Spector and Malach, 2004).
The present MEG results also support the idea that written letters systematically modulate children's responses to speech sounds in the left superior temporal cortex (Herdman et al., 2006; van Atteveldt et al., 2007; Froyen et al., 2008, 2009). Our longitudinal findings suggest that this effect is already present after a few months of formal reading instruction. A longer training period might be needed to reach a high degree of automaticity (and a shorter time window for audiovisual integration; Laasonen et al., 2000, 2002; Froyen et al., 2009). In the present study, the long stimulus-onset asynchrony (SOA) between the visual and auditory onsets, together with the relatively late latency of our audiovisual congruency effect, point to a low degree of automaticity. This is in line with a slow developmental trajectory for automatic letter–speech integration that extends beyond the first years of reading instruction (Froyen et al., 2009).
While the superior temporal cortex became progressively more sensitive to audiovisual letter congruency, other reading-related brain areas, such as the visual word form area (VWFA), did not show similar tuning. The shifted onset between the visual and the auditory presentation might have reduced the chances of observing an audiovisual congruency effect in ventral occipitotemporal areas. It is possible that, after early activation during the visual presentation, there was no additional VWFA recruitment when the spoken syllables were presented. More research on simultaneous and nonsimultaneous audiovisual presentations is needed to clarify this point. The lack of occipitotemporal effects might also relate to levels of reading automaticity, with the VWFA becoming more responsive to auditory/audiovisual stimuli as reading automaticity increases (Yoncheva et al., 2010; Monzalvo and Dehaene-Lambertz, 2013). The present findings suggest that at low levels of automaticity the left superior temporal cortex plays a crucial role in establishing cross-modal correspondences between letters and speech sounds. The VWFA does not seem to be as crucial at this stage but might become more relevant after several years of reading instruction (Froyen et al., 2009). These findings are in line with the idea that entrenched audiovisual brain networks represent an essential prerequisite for reading development that precedes the functional tuning of the VWFA (Blomert, 2011).
Previous research has reported a lack of occipitotemporal response during audiovisual processing (van Atteveldt et al., 2004; Karipidis et al., 2018), leading to the general claim that audiovisual congruency effects are more often observed in auditory than visual areas (van Atteveldt et al., 2004; Blomert and Froyen, 2010). However, such effects differ from those associated with the neural network for audiovisual speech, which requires a stronger involvement of visual areas (Calvert et al., 1998; Calvert, 2001). The source of this discrepancy might be related to the different nature of the audiovisual associations in question. While in audiovisual speech the visual component (i.e., lip movements) occurs simultaneously with speech input across the life span, the associations between letters and sounds are arbitrary and do not always occur simultaneously. Thus, although there is partial recycling of brain areas naturally evolved for audiovisual analysis, letter–sound associations maintain a certain degree of specificity (Blomert and Froyen, 2010).
We also found no effects in parietal areas, such as the supramarginal and angular gyri, which are generally thought to be involved in access to phonological representations of text (Pugh et al., 2000; Booth et al., 2004; Schlaggar and McCandliss, 2007). This might be because of differences in experimental design: audiovisual effects in parietal areas are more often observed in comparisons of unimodal and bimodal linguistic stimuli than in comparisons of matching and mismatching audiovisual conditions (Xu et al., 2018, 2019, 2020). These parietal areas may be more involved in audiovisual letter integration than in subsequent feedback to sensory brain areas.
Finally, although our participants were early bilinguals, the present results are compatible with those reported in monolinguals (Herdman et al., 2006; Hocking and Price, 2008; Karipidis et al., 2017, 2018). In addition, both writing systems learned by the children in this study (Spanish and Basque) were highly transparent and required similar learning strategies. Greater differences have been reported for late bilinguals (Bidelman and Heath, 2019a,b). Additional research is needed to understand to what extent neural correlates of audiovisual analysis can be generalized to diverse linguistic profiles.
In conclusion, the present study sheds light on the developmental changes of audiovisual syllable processing. Within the first months of reading instruction, children progressively set up letter–sound associations, which can be detected within the first 400 ms of bimodal presentation and recruit the left superior temporal cortex. This reading-dependent brain tuning supports the idea that general mechanisms of audiovisual processing are applied (at least partially) to new arbitrary correspondences between letters and speech sounds.
Footnotes
This project received funding from the European Union's Horizon 2020 research and innovation program under Marie Sklodowska-Curie Grant Agreement No. 837228 (H2020-MSCA-IF-2018-837228-ENGRAVING). The project was also funded by the Spanish Ministry of Economy, Industry and Competitiveness (Grant PSI2017-82941-P), the Basque Government through the BERC 2018-2021 Program, and the Agencia Estatal de Investigación through BCBL (Basque Center on Cognition, Brain and Language) Severo Ochoa excellence accreditation SEV-2015-0490.
The authors declare no competing financial interests.
Correspondence should be addressed to Sendy Caffarra at caffarra@stanford.edu.