Abstract
Reading sentences involves a distributed network of brain regions acting in concert surrounding the left sylvian fissure. The mechanisms of neural communication underlying the extraction and integration of verbal information across subcomponents of this reading network are still largely unknown. We recorded intracranial EEG activity in 12 epileptic human patients performing natural sentence reading and analyzed long-range corticocortical interactions between local neural activations. During a simple task contrasting semantic, phonological, and purely visual processes, we found process-specific neural activity elicited at the single-trial level, characterized by energy increases in a broad gamma band (40–150 Hz). Correlation analysis between task-induced gamma-band activations revealed a selective fragmentation of the network into specialized subnetworks supporting sentence-level semantic analysis and phonological processing. We extend the implications of our results beyond reading, to propose that gamma-band amplitude correlations might constitute a fundamental mechanism for large-scale neural integration during high-level cognition.
Introduction
Despite the rapid accumulation of evidence from functional neuroimaging, the neural processes that allow us to read a sentence like this one remain largely unknown. The brain regions that support reading have been successfully identified by fMRI and PET studies over more than two decades (Petersen et al., 1988; Price, 2000; Binder et al., 2003; Jobard et al., 2003; Démonet et al., 2005; Dehaene and Cohen, 2011; Price and Devlin, 2011), and functional connectivity studies have revealed correlated metabolic demands between those regions, suggesting that they behave collectively as a network, the “reading network” (Bullmore et al., 1996; Bokde et al., 2001; Homae et al., 2003; Mechelli et al., 2005). However, it is difficult to infer from purely metabolic measures the fast neural mechanisms that support network interactions during reading.
Following research on network dynamics initiated with EEG and MEG (Kutas and Federmeier, 2000; Pylkkänen and Marantz, 2003; Hald et al., 2006; Salmelin and Kujala, 2006; Kujala et al., 2008; McDonald et al., 2010; Barca et al., 2011), but at the finer spatial scale of intracranial EEG (iEEG), our intention was to reveal neural interactions between individual components of the reading network, regarding two of its most characteristic subprocesses: sentence-level semantic analysis and phonological processing.
Sentence-level semantic analysis recruits several regions also involved in speech perception, including the bilateral middle temporal gyrus, the angular gyrus, and often Broca's area (Mashal et al., 2009; Obleser and Kotz, 2009; Rogalsky and Hickok, 2009; Ye and Zhou, 2009; Price, 2010). Phonological processing (the production of an auditory representation of linguistic content) requires several subprocesses, only a fraction of which are specific to reading, such as grapheme-to-phoneme conversion in the left ventral temporal occipital cortex (Cohen et al., 2000). It also involves less specific processes, such as working memory, in conditions that emphasize the progressive formation of an auditory word form from individual syllables (Juphard et al., 2011), in strong interaction with subvocal articulation. Those processes involve mostly lower precentral and postcentral regions from Broca's area to the supramarginal gyrus, which are also generally active during speech production, for instance during picture-naming tasks (Sinai et al., 2005).
We used iEEG to clarify the interactions between those regions during a simple sentence reading task contrasting visual, phonological, and semantic analysis. We focused our analysis on gamma-band amplitude (GBA) fluctuations (40–150 Hz), which correlate with the BOLD signal (Logothetis et al., 2001; Kayser et al., 2004; Mukamel et al., 2005; Lachaux et al., 2007; Ojemann et al., 2010) and with population-level firing rate (Manning et al., 2009; Ray and Maunsell, 2011). These results suggest two testable predictions: first, that GBA should increase in all major nodes of the reading network during sentence reading, and second, that GBA time fluctuations should be correlated between those regions, in a task-dependent fashion, as do BOLD fluctuations (Richardson et al., 2011).
Our study confirmed those predictions and revealed that semantic analysis increases GBA correlation between inferior frontal and middle temporal sites, while phonological analysis strengthens the coupling between frontal, prefrontal, anterior parietal, and inferior temporal sites.
Materials and Methods
Patients and electrode implantation.
Twelve patients, candidates for drug-resistant partial epilepsy surgery, participated in this study (all female; average age, 30 years). All participants provided written informed consent and the experimental procedures were approved by the Institutional Review Board and by the National French Science Ethical Committee (Comité Consultatif de Protection des Personnes dans la Recherche Biomédicale). Before the experiment began, patients were screened with standard clinical neuropsychological tests to evaluate their ability to perform the task.
As the location of the epileptic focus could not be identified using noninvasive methods, the patients underwent intracranial EEG recordings by means of stereotactically implanted multilead depth electrodes [stereotactic EEG (SEEG)] (Kahane et al., 2004; Jerbi et al., 2009b). Selection of sites to implant was entirely based on clinical purposes, with no reference to the present experimental protocol. The 12 participants were native French speakers with normal or corrected-to-normal vision. Patients performed the task 4 d after the implantation of the electrodes.
Each patient was implanted with semirigid electrodes (0.8 mm diameter), each consisting of a linear array of 10–15 recording sites (with an equal intersite spacing of 3.5 mm, center to center) (Dixi). Electrodes were inserted perpendicular to the sagittal plane (see Fig. 1D), therefore recording from both medial and lateral cortical areas, including sulcal cortex. The spatial resolution of such intracerebral EEG recordings is on the order of the distance between consecutive recording sites, that is, 3.5 mm (Lachaux et al., 2003; Jerbi et al., 2009b). Electrode contacts were identified on each individual stereotactic scheme and then anatomically localized using the proportional atlas of Talairach and Tournoux (1988). In addition, computer-assisted matching of a postimplantation computerized tomography scan with a preimplantation 3-D MRI provided a direct visualization of the electrode contacts with respect to the brain anatomy of each patient (Activis).
Recordings.
Experimental data were recorded extraoperatively according to our routine procedure (Kahane et al., 2004). Intracranial recordings were conducted using an audiovideo EEG monitoring system (Micromed) that allowed the simultaneous recording of 128 SEEG channels sampled at 1024 Hz (0.1–250 Hz bandwidth) during the experimental paradigm. One of the contact sites in the white matter was chosen as a reference; however, before analysis, all signals were re-referenced to their nearest neighbor on the same electrode, 3.5 mm away (bipolar montage). Recording sites showing clear epileptiform activity were excluded from the analysis. Among the remaining sites, monopolar and bipolar data were systematically inspected, both raw and high-pass filtered (>15 Hz), and any trial showing epileptiform activity (such as spikes) in any of those traces was discarded. The cortical regions analyzed in this study were all located outside the seizure onset zone in each patient (Table 1), and had been functionally mapped with electrical cortical stimulation (Table 2).
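The bipolar re-referencing step can be illustrated with a minimal sketch. It assumes that the contacts of one electrode are stored as equal-length lists ordered along the shaft; the function name `bipolar_montage`, the sign convention (each contact minus its next neighbor), and the sample values are hypothetical and are not taken from the recordings.

```python
def bipolar_montage(contacts):
    """Return the difference between each pair of consecutive contacts.

    `contacts` is a list of equal-length sample lists, ordered along one
    electrode shaft (assumed layout; neighbors are 3.5 mm apart).
    """
    return [
        [a - b for a, b in zip(near, far)]
        for near, far in zip(contacts[:-1], contacts[1:])
    ]


# Three contacts, four samples each (made-up values, in microvolts).
shaft = [
    [10.0, 12.0, 11.0, 13.0],
    [9.0, 10.0, 10.5, 12.0],
    [4.0, 4.5, 5.0, 6.0],
]
bipolar = bipolar_montage(shaft)  # two bipolar derivations from three contacts
```

An electrode with n contacts yields n − 1 bipolar derivations, and any signal common to two neighboring contacts (including the white-matter reference) cancels out in the subtraction.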
Eye movements were recorded with two external electro-oculogram (EOG) channels.
Experimental paradigm.
The experiment consisted of three conditions: in the first condition (called, in short, “semantic” or SEM, because it involved semantic, as well as syntactic, processing at sentence level; see Fig. 1B, left panel), participants were presented with 80 short one-line sentences taken from a children's story [one sentence per trial (e.g., in French, “LE ROI A QUITTE SON CHATEAU”; in English, “THE KING HAS LEFT HIS CASTLE”)]. Sentences were in French and had between 7 and 13 words. However, in 20% of trials (16 trials), the sentence was from a completely unrelated story, found in a sports magazine (e.g., in French, “SON JEU DE REVERS EST VRAIMENT BON”; in English, “HIS BACKHAND IS REALLY GOOD”). The task was to identify for each sentence whether it belonged to the main story or not. The general presentation structure was the same for each trial (see Fig. 1A): First, a fixation cross appeared for 2000 ms at the left end of the screen, instructing the participant to shift her gaze to that position. The next screen displayed the sentence, which the participant had to read at her own pace until her gaze reached a fixation cross at the right end of the screen. At that time, the participant had to press a button (right index finger) to display the response panel, asking whether the sentence belonged to the main story (“left index button press”) or not (“right index button press”). That second button press triggered the next trial. Note that the experimental design made it possible to calibrate eye movements in each trial, since participants were instructed to look at a left and a right fixation cross.
In the second condition (called “phonological” or PHO because it involved explicit grapho-phonological conversion, as well as working memory; see Fig. 1B, middle panel), words were replaced with groups of repeated letters (e.g., “PPP HHHH OOO NNNN EEE”), and the participants had to shift their gaze from group to group with the explicit instruction to silently and progressively read the letter string formed by the letters of the successive groups (e.g., “PH..O..NE”). This was a lexical decision task: participants decided whether or not the resulting string was a real word (64 of 80 trials contained a word in each block). Nonwords were all pronounceable.
In the third condition [visual (VIS); see Fig. 1B, right panel], words were replaced by strings of alphanumeric signs, some of which contained a digit (e.g., “HDF3Z NGTD SZY2G FZTS JRGBD PFHD”). Participants had to read them, shifting their gaze from group to group and from left to right. The task was to decide whether there were two or three groups comprising digits (64 trials with two digits and 16 trials with three digits in each block). On average, the number of character strings displayed in each trial, in the visual or the phonological conditions, matched the sentence length in the semantic condition.
Three patients (P4, P6, and P12) also performed a separate single-word processing task. The semantic condition was a classic animacy decision: in each trial, a single word was displayed for 3 s, after a 1 s fixation period (central cross). Participants had to press a button to indicate whether the word was a living entity or not. In the phonological condition, stimuli were two- or three-syllable pronounceable pseudowords that patients had to read silently to specify the number of syllables with a response button.
Each condition was presented in blocks of 40 trials each in the following order: phonological, visual, semantic, visual, phonological, and semantic.
Stimuli were shown in white on a black background computer screen, located 70 cm in front of the participant. The experiment was performed using Presentation Package (version 0.70; Neurobehavioral Systems). Visual angle between peripheral left and right fixation crosses was 14° (−7 to 7°). Precise visual extent of sentences varied with sentence length, between 12 and 13°.
Data analysis.
Intracranial EEG data were analyzed with respect to the three experimental conditions involving free saccadic eye movements. Saccades onset and offset were identified from EOG data segments recorded during sentence reading (semantic) or during left-to-right gaze shifts across letter strings (phonological and visual). Identification was semiautomatic: a custom MATLAB (The Mathworks) script provided first-pass automatic detection of saccade onsets and offsets. The script used a moving window of 100 ms: in every window, EOG data were normalized (Z-score) relative to the mean and SD of the EOG signal measured in the first half of that window (the first 50 ms). Fixations (saccade offsets) were defined at samples with a local maximum and an absolute value >5. Saccade onset was defined as the first sample of the EOG slope leading to fixation. All saccades were then inspected manually to eliminate false-positives of the algorithm. In addition, careful visual inspection of the saccade-related activity was performed to exclude recording sites with artifactual gamma activity caused by extraocular eye muscle movements. We have previously shown how reading saccades can produce such artifacts, even in iEEG (Jerbi et al., 2009a).
Intracranial EEG data were analyzed in the time-frequency domain in the time interval from −1000 to 5000 ms, locked to array onset. For each epoch, the analysis was performed on bipolar derivations computed between adjacent electrode contacts. Continuous SEEG signals were first bandpass filtered in multiple successive 10-Hz-wide frequency bands (e.g., 10 bands from 50–60 to 140–150 Hz). Next, for each bandpass-filtered signal, we computed the envelope using a standard Hilbert transform. The obtained envelope has a time resolution of 64 Hz (time bins every 15.625 ms). Again for each band, this envelope signal (i.e., time-varying amplitude) was divided by its mean across the entire recording session and multiplied by 100. This yields instantaneous envelope values expressed in percentage (%) of the mean. Finally, the envelope signals computed for each consecutive frequency band (e.g., 10 bands of 10 Hz intervals between 50 and 150 Hz) were averaged together, to provide one single time series (the high gamma-band envelope). By construction, the mean value of that time series across the recording session is equal to 100.
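The envelope computation described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the bandpass filter is replaced by an ideal brick-wall selection of DFT bins, the analytic signal is obtained from a naive DFT rather than an optimized Hilbert routine, the downsampling to 64 Hz is omitted, and the names `band_envelope` and `high_gamma_envelope` as well as the synthetic test signal are hypothetical.

```python
import cmath
import math


def band_envelope(x, fs, f_lo, f_hi):
    """Amplitude envelope of x within [f_lo, f_hi) Hz.

    Keeps only positive-frequency DFT bins inside the band (brick-wall
    filter, an assumption) and doubles them to build the analytic signal.
    """
    n = len(x)
    bins = [k for k in range(1, n // 2) if f_lo <= k * fs / n < f_hi]
    coeffs = {
        k: sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in bins
    }
    env = []
    for t in range(n):
        z = sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in coeffs.items())
        env.append(abs(2 * z / n))
    return env


def high_gamma_envelope(x, fs, f_start=50, f_stop=150, width=10):
    """Average of band envelopes, each expressed in % of its own mean."""
    bands = [(f, f + width) for f in range(f_start, f_stop, width)]
    normalized = []
    for f_lo, f_hi in bands:
        env = band_envelope(x, fs, f_lo, f_hi)
        mean_env = sum(env) / len(env)  # assumes nonzero energy in every band
        normalized.append([100 * v / mean_env for v in env])
    return [sum(vals) / len(bands) for vals in zip(*normalized)]


# Synthetic 1 s signal: one carrier per band, all modulated at 2 Hz.
fs = 512
carriers = range(55, 150, 10)
x = [
    (1 + 0.5 * math.sin(2 * math.pi * 2 * t / fs))
    * sum(math.sin(2 * math.pi * f * t / fs) for f in carriers)
    for t in range(fs)
]
hg = high_gamma_envelope(x, fs)
mean_hg = sum(hg) / len(hg)
```

Because each band is divided by its own mean before averaging, every band contributes equally to the final time series regardless of the 1/f decay of the raw spectrum, and the session mean of the result is 100 by construction.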
All statistical evaluations were done on band-limited envelopes (i.e., amplitude profiles in time). A first assessment of neural response was obtained through the comparison of poststimulus activity to its average baseline power level (−500 to −100 ms) with a Wilcoxon signed-rank test for matched pairs, with a Bonferroni correction for multiple comparisons for number of time samples and electrodes. We applied this test separately for each condition across all electrodes and all time points between 0 and 5000 ms poststimulus for each patient individually.
Comparison between tasks was done between gamma-band envelopes via a Kruskal–Wallis nonparametric test, thus avoiding any assumption about the data distribution (Kruskal and Wallis, 1952). Specific effects were further studied by means of a post hoc Tukey–Kramer test, with a Bonferroni correction for multiple comparisons (samples by electrodes).
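For readers unfamiliar with the Kruskal–Wallis test, the sketch below computes the H statistic for three groups with midranks and the usual tie correction. It relies on the chi-square approximation, whose survival function for 2 degrees of freedom is exactly exp(−H/2); the function names and the amplitude values are made up for illustration, and the post hoc Tukey–Kramer step is not shown.

```python
import math


def midranks(values):
    """Return 1-based ranks (ties share their average rank) and tie sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_3(a, b, c):
    """H statistic and chi-square (df = 2) p value for three groups."""
    groups = [a, b, c]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, ties = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    h /= 1 - sum(t ** 3 - t for t in ties) / (n ** 3 - n)  # tie correction
    return h, math.exp(-h / 2)


# Made-up gamma-band amplitudes (% of mean) in the three conditions.
vis = [98, 101, 103]
pho = [110, 115, 120]
sem = [130, 140, 150]
h_stat, p_value = kruskal_wallis_3(vis, pho, sem)
```

In this toy case the three groups are perfectly ordered, which gives H = 7.2 and p ≈ 0.027 before any correction for multiple comparisons.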
We also tested whether gamma-band activity was modulated by the timing of fixations. Since each fixation brings novel information into the reading network, one might expect that part of that network would react to each saccade with a systematic gamma-band energy increase, time-locked to fixation. For that purpose, data were divided into 1 s epochs centered on each fixation. We then performed a Kruskal–Wallis test comparing gamma-band amplitude across eight non-overlapping windows of 50 ms duration covering a 0:400 ms interval (relative to fixation). The test was corrected for multiple comparisons (number of channels by number of conditions). Significant p values indicate an effect of window latency on the gamma amplitude within the 0:400 ms interval (i.e., an effect of fixation on the gamma amplitude over that short timescale).
Amplitude correlations.
Based on the average reaction times measured in the three conditions, we chose to compute correlation coefficient in a 3 s window starting at the stimulus onset, to restrict the analysis to the actual reading or visual processes.
In each condition, we measured for each trial the correlation coefficient between the gamma envelopes recorded in pairs of recording sites. This distribution of correlation coefficients was then compared through a Wilcoxon nonparametric test with the distribution of coefficients obtained when shuffling trials from one of the two sites (e.g., computing the correlation between trial i from site 1 and trial j from site 2, i ≠ j). This surrogate distribution included all possible shufflings (i.e., for each individual trial of the first electrode, the correlation between that trial and all other trials from the second electrode). For each patient, the threshold of significance was set to 0.05 divided by the total number of channel pairs tested in this patient (Bonferroni's correction for multiple comparisons). Channel pairs with a significant correlation coefficient in any of the three conditions were then passed through a Kruskal–Wallis procedure comparing the coefficients measured in the three conditions, to test for task sensitivity with a post hoc Tukey–Kramer procedure.
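The surrogate procedure can be sketched as follows, under several assumptions: the envelopes are simulated (a component shared within each trial plus independent noise), the comparison uses a rank-sum test with a normal approximation and no tie handling, and the number of channel pairs used for the Bonferroni threshold is hypothetical. None of the names below come from the original analysis scripts.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def real_and_surrogate(site1, site2):
    """Matched-trial correlations and all mismatched-trial (i != j) ones."""
    n = len(site1)
    real = [pearson(site1[i], site2[i]) for i in range(n)]
    surrogate = [
        pearson(site1[i], site2[j]) for i in range(n) for j in range(n) if i != j
    ]
    return real, surrogate


def rank_sum_p(x, y):
    """Two-sided rank-sum p value (normal approximation, ties ignored)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    w = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(w - mean_w) / sd_w / math.sqrt(2))


# Simulated envelopes: 20 trials of 50 samples for two sites.
random.seed(0)
site1, site2 = [], []
for _ in range(20):
    shared = [random.gauss(0, 1) for _ in range(50)]
    site1.append([s + random.gauss(0, 0.5) for s in shared])
    site2.append([s + random.gauss(0, 0.5) for s in shared])

real, surrogate = real_and_surrogate(site1, site2)
p_value = rank_sum_p(real, surrogate)
n_pairs_tested = 10  # hypothetical number of channel pairs in one patient
significant = p_value < 0.05 / n_pairs_tested
```

Shuffling trials preserves everything that is locked to stimulus onset in both sites, so a pair passes this test only if the two envelopes covary within trials beyond what their average evoked time courses would predict.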
Prediction value estimate.
The prediction value estimate is computed as follows: a sample X is considered not to belong to group P1 if it exceeds 95% of the values of P1 (risk 5%); N2 is the number of elements of P2 that do not belong to P1. The reported percentage corresponds to the following: 100 − 100 * (N2/P2). If all elements of P2 pass the test, then the score should be 0 (dissociation index). All reported prediction value estimates are thus associated with a p value of p < 0.05.
Test for volume conduction effects.
To evaluate the contribution of proximity to measured correlation values, we evaluated the correlation between sites as a function of the distance between them. Regular spacing between consecutive probes on SEEG electrodes provides a convenient way to measure the fall-off of correlation with distance between sites. We selected eight electrodes sampling frontal, temporal, and parietal lobes in six patients, and estimated the correlation for each pair within each electrode. We tested a total of 675 pairs, separated by an intersite distance ranging from 3.5 to 17.5 mm; the percentage of significant correlation coefficients (comparison with trial-shuffled surrogates, p < 0.05, uncorrected for multiple comparisons) decreased as a function of distance to fall below chance level at 17.5 mm: 3.5 mm, 98% (180 of 183); 7 mm, 62% (99 of 159); 10.5 mm, 51% (69 of 135); 14 mm, 21% (24 of 111); 17.5 mm, 4% (3 of 87). We concluded that high correlation values between sites separated by >2 cm were unlikely to be attributable to volume conduction.
Results
Task description and behavioral results
The three conditions of the experiment are summarized in Figure 1, A and B: (1) reading a meaningful sentence [semantic condition (SEM)], (2) integrating individual letters into a word or pseudoword while emphasizing mental pronunciation [phonological condition (PHO)], and (3) searching for digits embedded in groups of consonant strings [visuo-orthographic condition (VIS)].
This design was adapted from a classic single-word presentation protocol (Mainy et al., 2008) and revised to mimic the oculomotor behavior characteristic of sentence reading. The three conditions are also reminiscent of three major stages of reading acquisition: (1) seeing words as meaningless groups of letters (VIS), (2) deciphering the word letter by letter or syllable by syllable (PHO), and (3) reading as an expert reader (SEM).
Visual inspection of the stimuli lasted between 4 and 5 s on average, with no significant difference between conditions (VIS, mean, 5066 ± 1971 ms; PHO, 4350 ± 1813 ms; SEM, 3832 ± 1605 ms; Kruskal–Wallis, p > 0.05). To rule out that differences in gamma-band activity across conditions could be due to different oculomotor patterns, we performed a simple quantification of saccadic behavior in all conditions: there were four to five saccades and fixations per trial for a total number of saccades and fixations close to 400, with no significant difference between conditions (VIS, 394 ± 145; PHO, 369 ± 120; SEM, 366 ± 82; Kruskal–Wallis, p > 0.05). There were 80 trials per condition. The mean duration of a fixation was ∼400 ms (VIS: 456 ± 135 ms; median, 388; PHO: 389 ± 83 ms; median, 373; SEM: 358 ± 83 ms; median, 345), but the saccade latency distribution (Fig. 1C) revealed a clear asymmetric shape, with an accumulation of values between 200 and 400 ms (VIS, 56% of all fixations durations; PHO, 62%; SEM, 67%). Fixations were slightly shorter in the SEM than in the VIS condition (Kruskal–Wallis test followed by post hoc Tukey–Kramer, p = 0.043). Overall, we found that oculomotor behavior was, if not identical, at least comparable in the three conditions.
Average behavioral accuracy was 96.35% (±0.66, standard error of the mean) correct responses in the semantic condition, 96.56% (±0.51) correct responses in the phonological condition, and 95.52% (±0.52) correct responses in the visual condition. A one-way ANOVA showed no significant difference across conditions (F = 0.94; p = 0.4).
High-frequency energy modulations during reading
Cortical responses were defined from task-related energy modulations of iEEG signals between 50 and 150 Hz, which we refer to as “gamma-band responses” (GBRs). This measure has become increasingly popular in iEEG studies because of its high stimulus and task sensitivity (Jacobs and Kahana, 2009; Vidal et al., 2010) and because of its coupling with the BOLD signal (Logothetis et al., 2001; Kayser et al., 2004; Mukamel et al., 2005; Niessing et al., 2005; Lachaux et al., 2007; Ojemann et al., 2010) and population-level neural firing rate (Manning et al., 2009; Whittingstall and Logothetis, 2009; Ray and Maunsell, 2011). Theta-band activity was not analyzed in this study because its frequency range overlapped with the dominant rhythm of reading saccades, so endogenous theta oscillations could not be isolated from neural activity locked to eye movements. GBRs were detected in each participant and for each recording site, by comparing the energy in two poststimulus windows [(0:1000 ms) and (1000:2000 ms) relative to stimulus onset] with a neutral prestimulus baseline during screen fixation (−500:−100 ms) (Wilcoxon's test). After Bonferroni's correction for multiple comparisons, the analysis revealed significant positive responses in 9.9% (129 of 1302) and significant negative responses in 2.9% of the recording sites (38 of 1302) (p < 0.05; corrected for 2604 tests).
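The counts reported above can be checked with a few lines of arithmetic. The sketch assumes that the Bonferroni correction was applied in its standard form, dividing the nominal threshold of 0.05 by the number of tests (sites multiplied by poststimulus windows); the variable names are illustrative.

```python
n_sites = 1302
n_windows = 2  # (0:1000 ms) and (1000:2000 ms)
n_tests = n_sites * n_windows

# Per-test threshold under a standard Bonferroni correction (assumed form).
alpha_corrected = 0.05 / n_tests

pct_positive = 100 * 129 / n_sites  # sites with significant energy increases
pct_negative = 100 * 38 / n_sites   # sites with significant energy decreases
```

With 2604 tests, the per-test threshold is close to 1.9 × 10⁻⁵, which helps explain why only strong and consistent responses survive the correction.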
The sparseness of the effects was explained by the anatomical selectivity of the reading network. Responses were strong and consistent across participants, clustered in a small number of cortical regions that matched the different nodes of the reading network: left middle temporal gyrus (MTG), left supramarginal gyrus (SMG), left inferior frontal cortex (IFC), and ventral occipitotemporal cortex (VOTC) (Figs. 2–4). Based on differences in task sensitivity and timing, we further distinguished in the IFC two subregions that reflected preference for either SEM or PHO conditions (Fig. 3). We termed these regions IFC SEM and IFC PHO. Similarly, the VOTC was subdivided into three ROIs, interior (INT), intermediate (MID), and lateral (EXT) (see the next section and Fig. 4). Two other regions, the dorsal lateral prefrontal cortex (DLPFC) and the supplementary motor area (SMA), showed undifferentiated activations in all conditions (data not shown). Negative responses (i.e., energy decreases relative to baseline) were also found in the temporo-parietal junction (TPJ) and the ventral-lateral prefrontal cortex (VLPFC) (data not shown). As previously shown by our group, they coincide with task-unspecific deactivations of the default-mode network (Lachaux et al., 2008; Mainy et al., 2008; Jerbi et al., 2010).
The VOTC was the only region showing gamma-band energy increases time-locked to fixation [P12: VOTC MID (one site) and VOTC EXT (one site); P8: VOTC MID (two sites) and VOTC EXT (one site); P5: VOTC MID (three sites); Kruskal-Wallis comparison testing for an effect of postfixation latency on gamma-band amplitude] (see Materials and Methods). A systematic energy peak was observed between 200 and 400 ms following fixation, consistent with response latencies observed after flashed visual stimuli in those regions.
Comparison of the latency of gamma-band responses after stimulus onset revealed a robust difference between posterior and anterior clusters (Wilcoxon's test of matched pairs vs baseline, FDR corrected for multiple comparisons): responses started earlier, before 200 ms, in VOTC INT and MID (patients: 1, 2, 5, 8, 11, 12) and later in the two PFC clusters, after 500 ms (patients: 1, 2, 5, 6, 7).
Components of the sentence reading network
Phonology in the supramarginal and posterior inferior frontal cortex
The SMG and posterior IFC PHO extend on both sides of the central sulcus, reaching Broca's area pars opercularis in their most anterior extension (Fig. 2 for SMG and Fig. 3 for IFC PHO, respectively). Both regions have been associated with phonological processing in most brain imaging studies (Fiez and Petersen, 1998; Gabrieli et al., 1998; Price and Mechelli, 2005; Juphard et al., 2011), and our results are in agreement with those observations: GBA responses were significantly stronger in the PHO than in the VIS condition in all sites (SMG, 13 of 13; IFC PHO, 20 of 20; for all, p < 0.05). Responses in the PHO condition were also stronger than in the SEM condition in most sites (SMG, 13 of 13; IFC PHO, 14 of 20; for all, p < 0.05). These effects are visible in the ROI group average (Fig. 2C,D, left panel) but also in single-trial gamma-band amplitude profiles, organized according to reaction times (Fig. 2C,D, middle panel). Using the amplitude of the gamma-band response as a simple classification criterion, we found that in some SMG and IFC PHO sites, as illustrated in Figure 2B, the dissociation between PHO and VIS responses was complete: 2 s of recordings randomly drawn anywhere from the 30 min experiment were sufficient to determine with a 95% accuracy whether the participant was performing the PHO or the VIS task.
Semantics in the middle temporal and anterior inferior frontal cortex
A global analysis of task preference in active IFC sites confirmed a functional partition in the frontal lobe between phonological and semantic processing (Fig. 3). Overall, the SEM condition activated sites more anterior than the PHO condition; however, the anatomical dissociation between the two clusters was quite subtle and sometimes along the lateral dimension, as illustrated by Figure 3A: a clear-cut change in task sensitivity in the IFC across sites <7 mm apart. Semantic responses were in fact found well beyond Broca's area in the more dorsal lateral prefrontal cortex. In that broad frontal ROI, the majority of sites had stronger responses in the SEM than in the PHO or VIS condition (SEM > PHO, 11 of 18; SEM > VIS, 12 of 18; for all, p < 0.05), and none showed opposite effects. The dissociation between SEM and PHO was not as pronounced as between PHO and VIS in SMG and IFC PHO, but reached 74% of correct classifications (p < 0.05 for all sites). We found even stronger preferences for SEM in the middle temporal gyrus (Fig. 2C,D), in an elongated ROI connecting the temporal pole with the angular gyrus (SEM > PHO, 20 of 24; SEM > VIS, 19 of 24; with a maximum of 72% correct separation between SEM and PHO; for all, p < 0.05). Interestingly, MTG sites were active during semantic analysis of sentences, but not in response to isolated words in the three patients who performed a secondary animacy decision task on single words (see Materials and Methods). In patient 4, for instance, MTG sites active in the SEM condition of the sentence reading task showed either no response or even a slight deactivation while processing the meaning of single, isolated words. Single-trial gamma-band responses in Figure 2E show no energy increase relative to baseline in the semantic condition. This example illustrates the marked difference in neural activity between sentence-level and single-word semantic processing.
Functional heterogeneities in the ventral occipitotemporal cortex
In the VOTC, activation time profiles and task sensitivity clearly distinguished between three distinct clusters along a medial-lateral axis (Fig. 4A). The most medial and lateral ROIs responded preferentially to the PHO and SEM conditions, and were separated, in individual subjects, by an intermediate region showing little or no preference in the fusiform gyrus. On individual MRIs, the most medial ROI corresponded to the border of the lingual gyrus and the collateral sulcus, and the most lateral ROI to a region immediately lateral to the fusiform, in the inferior temporal gyrus (ITG).
Further differences in GBA time courses suggest that the lingual gyrus/collateral sulcus and the ITG do not support the same function: as seen in Figure 4A, lingual GBA responses start with a rapid and transient increase equivalent across all conditions, which is nearly absent in the ITG. The lingual gyrus and the ITG also differed in task sensitivity: lingual GBA responses were stronger in the SEM than in the PHO condition during the first 1000 ms of visual inspection, when the integration of the individual letters into a word or a pseudoword, characteristic of the PHO condition, has not yet been completed (SEM > PHO in four of eight sites, with no site showing the opposite effect; for all, p < 0.05). The single-trial gamma-band response profiles in Figure 4C show a clear transition from a medial temporal selectivity for the SEM and PHO conditions indifferently to a lateral superiority effect in the PHO condition only. The participant is confronted with groups of identical letters in the PHO condition (e.g., “PPP”), and with real words in the SEM condition. The superiority of the SEM GBA response might then be interpreted as a preference for word forms in the lingual gyrus. Yet this must be reconciled with the observation that the later part of the PHO GBA response is similar to the SEM response, while visual stimuli remain markedly different. In contrast, the ITG GBA response showed a clear preference for the PHO condition, resembling the selectivity of the SMG and posterior IFC (PHO > VIS in 20 of 20 sites, PHO > SEM in 14 of 20 sites; with a 100% dissociation rate between PHO and VIS). This preference was particularly clear in some ITG sites with markedly diminished GBA responses in the SEM condition (Fig. 4B).
Task-dependent correlations in the reading network
We reasoned that neural populations interacting within the reading network should exchange information as they process it, in a parallel fashion, and therefore be active at the same time. Such time coincidence can be directly measured by correlation coefficients between time courses of gamma-band responses induced by the stimulus in pairs of recording sites. Figures 5–7 show three different examples of task-specific long-range gamma-band amplitude correlations. The correlation analysis was restricted to regions responding to the task, as described before. In each ROI, we selected sites with significant task sensitivity (SEM > VIS or PHO > VIS), and assigned them to one of the two networks, semantic or phonological, or to both (according to the condition yielding the strongest response in that site). This task sensitivity is visible at the single-trial level (Figs. 5B, 6B, 7B). We found that seven patients had sites with similar task sensitivity but distant anatomical locations. We then considered in each patient pairs of sites with similar sensitivity (for a total of 28 pairs across all patients), measured the strength of the correlation between their gamma-band responses in every trial, and compared those measures in the three conditions. Significant correlation coefficients were defined relative to trial-shifted surrogate data (see Materials and Methods). We found that 64% of pairs had significantly correlated gamma-band responses (18 of 28) when considering their preferred condition. This percentage fell to 25% when considering their least preferred condition (e.g., the VIS condition for an MTG/IFC pair). This shows that the correlation between regional gamma-band amplitude (50–150 Hz) changes was not systematic but dependent upon task demands. Figures 5C and 6C show three single-trial examples of gamma-band amplitude correlations. Figure 7C displays an example of anticorrelation.
This task dependence was confirmed by a Kruskal–Wallis test of the effect of condition on correlation values: the test was significant for all pairs (28 of 28), with stronger correlation coefficients in the preferred condition in each case (Figs. 5D, 6D, 7D). The statistical strength of the correlations was visible at the single-trial level (Fig. 8B). A summary of the gamma-band correlations connecting the previously mentioned nodes of the reading network is shown in Figure 8A.
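For one pair of sites, a Kruskal–Wallis comparison of per-trial correlation values across the three conditions could look like the sketch below. It is an illustration under stated assumptions: the correlation values are invented, no tie correction is applied, and the p-value relies on the chi-square approximation, whose survival function for 2 degrees of freedom (three groups) reduces to exp(−H/2). The function name is hypothetical.

```python
import math


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H and approximate p-value for exactly three groups.

    Assumes no tied values (no tie correction) and uses the chi-square
    approximation with 2 degrees of freedom: p = exp(-H / 2).
    """
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    # Pool all values, remembering which group each one came from.
    pooled = sorted(
        (value, g) for g, values in enumerate(groups) for value in values
    )
    n_total = len(pooled)
    # Sum the ranks (1 = smallest value) within each group.
    rank_sums = [0.0, 0.0, 0.0]
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        rs ** 2 / len(values) for rs, values in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    return h, math.exp(-h / 2)


# Hypothetical per-trial correlation coefficients for one semantic pair.
sem = [0.61, 0.55, 0.70, 0.48, 0.66]
pho = [0.20, 0.31, 0.12, 0.27, 0.35]
vis = [0.05, -0.02, 0.10, 0.08, -0.06]

h_stat, p_value = kruskal_wallis_3([sem, pho, vis])
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

In this toy example the three conditions do not overlap at all, so H takes the largest value possible for three groups of five trials.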
To rule out any spurious volume conduction effect, we conducted a separate analysis measuring correlation coefficients between random sites as a function of the distance between them (see Materials and Methods). We found that the probability that two random sites were significantly correlated fell to chance level (p < 0.05, uncorrected for multiple comparisons) at distances >20 mm (all 28 pairs selected in the previous analysis were >20 mm apart). One of the strongest correlation values in the real data (r = 0.62) was found between a prefrontal site and a temporal site separated by >8 cm (Fig. 7A–D) [Talairach coordinates: IFC (−41 38 13) and MTG (−50 −45 −12)]. One remaining concern was that correlations could be due to a common driving of neural activity by reading saccades. Since our significance criterion was based on a comparison with trial-shifted surrogates, our analysis was vulnerable to such an effect, because different trials are constrained by different oculomotor dynamics and the correlation between nonmatched trials might be low for this reason. However, the typical fixation duration ranged between 200 and 300 ms (see saccade timing distributions; Fig. 1B), which corresponds to a saccade rhythm of roughly 3.3–5 Hz. If saccades had been driving neural activity in the reading system in a correlated fashion, high correlation values would be due to this rhythmic component. This was not the case: we performed the same correlation analysis as above after removing frequency components >3 Hz from the gamma-amplitude modulations, and all significant results remained significant, while all nonsignificant results remained nonsignificant.
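Two of the numbers in this control analysis can be checked with simple arithmetic. The sketch below uses only the coordinates and fixation durations quoted above; the variable names are illustrative, and the straight-line distance is an assumption about how site separation is measured.

```python
import math

# Talairach coordinates (mm) quoted in the text for the distant correlated pair.
ifc = (-41, 38, 13)
mtg = (-50, -45, -12)

# Straight-line (Euclidean) separation between the two recording sites.
distance_mm = math.dist(ifc, mtg)

# Fixations lasting 200-300 ms imply a saccade rhythm of 1 / duration.
fixation_durations_s = (0.2, 0.3)
saccade_rates_hz = [1.0 / d for d in fixation_durations_s]

# Cutoff used in the control analysis: components above 3 Hz were removed.
cutoff_hz = 3.0

print(f"site separation: {distance_mm:.1f} mm")
print(f"saccade rhythm: {min(saccade_rates_hz):.1f}-{max(saccade_rates_hz):.1f} Hz")
print("rhythm entirely above cutoff:", min(saccade_rates_hz) > cutoff_hz)
```

The separation comes out at about 87 mm, in line with the >8 cm stated above, and the whole 3.3–5 Hz saccade range lies above the 3 Hz cutoff, which is why removing faster components is a meaningful control.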
To determine whether amplitude correlations were due to coherent amplitude fluctuations in a specific frequency range, we performed a coherence analysis in each single trial between gamma-band amplitude time series (between 0 and 3000 ms following stimulus onset). For every pair of recording sites detected by the amplitude correlation analysis, we compared the distribution of coherence values in two frequency bands (alpha, 8–12 Hz; beta, 16–24 Hz) with trial-shuffled surrogates, and found no significant values (p > 0.1, uncorrected for multiple comparisons). We conclude that amplitude correlations occur between nonrhythmic modulations of gamma-band activity.
Overall, our study shows that energy modulations in the gamma band are correlated in more than one-half of the pairs (64%) belonging to the same functional network. This correlation is not due to volume conduction or saccadic eye movement. It is also task dependent: two subregions of the same network are predominantly correlated when this network is actively processing information.
Discussion
We used direct, high-resolution, neural recordings to reveal two novel properties of the reading network: (1) it is characterized by systematic increases in neural activity as soon as an individual reads a sentence, in a broad (40–150 Hz) gamma frequency range; (2) neural responses induced by the reading process are correlated within the network, across long distances and in a task-dependent fashion. We will now discuss those findings, starting with long-range interactions.
A previous study (Nir et al., 2008) had shown that spontaneous gamma-band amplitude fluctuations are correlated between homologous sensory (auditory) cortices during rest and sleep. The novelty of our findings is that such amplitude correlations occur also during high-level cognition and are strongly task specific. Our results complement a recent meta-analysis of the language networks (Price, 2010) pointing out the current lack of knowledge about the long-distance cortical interactions supporting language processing. Correlated metabolic demands had been reported within the reading network since the mid-1990s (Bullmore et al., 1996), but our study provides novel insights regarding how neural populations interact during reading. We demonstrate correlated fluctuations of neural activity between the inferior frontal gyrus (IFG) and the MTG during semantic analysis and between the DLPFC, IFG, SMG, and VOTC during phonological analysis. Interestingly, the latter observation fits with a recent report that cortical potentials evoked by intracranial electrical stimulations show a direct anatomical connection between basal temporal and perisylvian cortex (Koubeissi et al., 2011).
We found some sites in the DLPFC and the SMA with a strong and sustained energy increase throughout the task, with no difference in timing or amplitude between experimental conditions. However, these sites should not be mistaken for sites preferentially active in the semantic condition in the LPFC. Numerous fMRI and animal electrophysiology studies have shown that subregions of the DLPFC support working memory and task-set maintenance, and are generally active during most types of goal-driven behavior (Grosbras et al., 1999; Sakai, 2008); the sites reported here might therefore support goal-driven behavior involving eye movements, as required by each of the three experimental conditions.
Long-distance amplitude correlations have received relatively little interest compared with long-distance phase synchronization (Salmelin and Kujala, 2006; Kujala et al., 2007), partly because synchrony is believed to be the main mechanism for neural integration across distant brain regions (Varela et al., 2001). But phase coupling cannot be easily defined for broadband signals ranging between 40 and 150 Hz. Our approach rather draws from earlier findings that trial-to-trial amplitude fluctuations of visual gamma-band responses are correlated between temporal and parietal regions (Lachaux et al., 2005) and from previous demonstrations that amplitude- and phase-based neural connectivity can reflect different underlying mechanisms. Gamma-amplitude modulations have been associated with the emergence of local communication processes between nearby neurons (Singer, 1999), and with increased local neural activity (increased spike count) (Miller, 2010). By extension, gamma-amplitude correlations would then be a natural reflection of the parallel organization of the brain: neural populations would interact maximally when they actively process information. The neural mechanism we propose here might co-occur with a phase synchronization phenomenon in narrow frequency bands (Salmelin and Kujala, 2006; Kujala et al., 2007), which has been associated with other mechanisms of neural communication (Fries, 2005, 2009). Also, because gamma-band activity has been shown to correlate with the BOLD signal (Logothetis et al., 2001; Mukamel et al., 2005; Niessing et al., 2005), gamma-amplitude correlations could be a neural correlate of the BOLD covariations revealed by fMRI functional connectivity measures.
Neuroimaging studies often report group statistics that exclude intersubject regional and functional variability. Such individual specificities can only be revealed by comparing adjacent neural responses in individual subjects across different task conditions. iEEG has often proven to be a powerful approach for this purpose (Sahin et al., 2009), and it provided these advantages here when analyzing the left ventral occipitotemporal cortex and the middle temporal gyrus.
The left VOTC is a major node of the reading network, with preferential responses to word form visual stimuli in a region lateral to the midportion of the fusiform gyrus, often called the “visual word form area” (VWFA) (Cohen et al., 2000, 2002). Our results are consistent with such specialization, but interindividual response variations led us to dissect this broad area into three regions of interest containing recording sites along the medial-lateral axis. Sampling along this axis occurs repeatedly as a consequence of the depth-electrode implantation procedure.
We found reading-specific responses in the medial and lateral VOTC on each side of an intermediate region showing little or no preference for word forms. Detailed anatomical analysis in the two participants with consecutive sites on a single electrode sampling all three regions revealed a clear-cut frontier in the collateral sulcus, separating medial responses in the lingual gyrus from nonspecific responses in the fusiform gyrus. Lingual sites responded most vigorously in the two conditions that required reading processes, phonological and semantic. This is in disagreement with earlier findings suggesting that the lingual gyrus has no specificity for word forms, and might even respond more strongly to false-font stimuli (Vinckier et al., 2007). One possible explanation is that BOLD signals might be biased by the strong initial transient peak of lingual responses, 200 ms after stimulus onset, which is similar for word forms and random consonant strings. This common transient might underlie a global identification process of the visual stimulus, to determine whether it contains a word form. iEEG reveals that this initial step is followed by a second period of neural activity specific to word forms. An intriguing result, however, is that the later part of the neural response is equally sustained in both phonological and semantic conditions. Yet stimuli in the phonological condition are, for the visual system, meaningless letter groups (e.g., “NNN”): the word form emerges only when individual letters are progressively assembled in visual short-term memory (“P” + “H” + “O” + “N” + “E” = “PHONE”). This suggests that the lingual gyrus reacts to word forms, both as actual visual inputs (i.e., a fully written word) and as purely mental constructs (i.e., at the final stage of the assembly process required by the phonological condition).
Lingual GBA responses in the phonological condition might then correspond to a search process within an internal lexicon of word forms.
The most lateral VOTC GBA responses were slightly delayed compared with responses in the lingual gyrus. They also occurred only in the phonological and semantic conditions, and were stronger in the former, with some sites showing no response at all to words in the semantic condition. This finding was unexpected, considering that this region of interest lies within the limits of the visual word form area (Cohen et al., 2000, 2002; Mainy et al., 2008; Vidal et al., 2010). The VWFA is believed to support word form recognition, whereas lateral GBA responses in our study are more consistent with either phonological or memory processes. The phonological condition includes a strong visual working memory component, since individual letters must be assembled mentally into a word. This suggests that the lateral VOTC might serve for that purpose as a visuospatial sketchpad, in the sense proposed by Baddeley (2010). This is consistent with recent findings from our group that gamma-band activity increases in this region with memory load during a visuospatial working memory task (Hamamé et al., 2012). The additional observation that this region is strongly correlated with the DLPFC suggests a large-scale fronto-temporal network supporting visual working memory (Pasternak and Greenlee, 2005). These findings suggest that the VOTC contains three subregions involved in reading: the lingual gyrus and the lateral VOTC, in addition to the VWFA. The lingual gyrus could subserve lexical retrieval, while the lateral VOTC could support visual working memory processes.
Most gamma-band responses reported in this study fit well with the preexisting neuroimaging literature on language processing during reading, and with previous iEEG studies on single-word identification, including the clear dissociation between semantics and phonology in the inferior frontal lobe (Sahin et al., 2009). Recruitment of the IFG and SMG during phonological processing is also largely established, for both grapheme-to-phoneme conversion and auditory working memory (Hickok and Poeppel, 2007; Mainy et al., 2008; Hickok, 2009; Price, 2010; Juphard et al., 2011). Here, we extend these considerations by reporting these regions as a network of correlated neural activations. It was less expected, however, that a region comprising the middle temporal gyrus and the superior temporal sulcus would be active during sentence processing, but not in response to isolated words, a notable discrepancy with our previous single-word study (Mainy et al., 2008). During sentence comprehension, active sites in the MTG defined a broad cortical territory, reaching the canonical Wernicke's area in its most posterior extension. Within this region, our simple experimental design did not reveal any functional heterogeneity, but rather, a highly integrated network, with strong site-to-site gamma-amplitude correlations (r = 0.62) in the semantic condition. If the MTG comprises subcomponents, they must exchange information rapidly during sentence comprehension. The exact functional role of the MTG is, however, not yet completely clear. The MTG is deactivated in many tasks requiring attentive processing of external stimuli, such as visual search tasks similar to the visual condition in our task (Ossandon et al., 2011).
For this reason, the MTG has often been associated with the default-mode network (Ossandon et al., 2011), in relation to endogenous, mind-wandering processes such as the covert, imaginary narration involved in “story-building.” This hypothesis is consistent with previous reports (Price, 2010) that this region is specifically activated during sentence comprehension, when consecutive words are integrated into a coherent story, and not for single, isolated words. The MTG cluster included the superior temporal sulcus, which is associated with social cognition and is dysfunctional in autism (Zilbovicius et al., 2006). Our interpretation is that the MTG plays a central role in elaborating short, visually based scenarios that naturally arise when understanding concrete sentences describing people's actions (Redcay, 2008).
Footnotes
This work was supported by the Agence Nationale de la Recherche, CONTINT “OpenVibe2”, ANR blanc “MLA” (J.-P.L.), and Fondation Fyssen (C.H.). We thank all patients for their participation; the staff of the Grenoble Neurological Hospital epilepsy unit; and Dominique Hoffmann, Patricia Boschetti, Carole Chatelard, and Véronique Dorlin for their support.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Juan R. Vidal, Institut National de la Santé et de la Recherche Médicale Unité 1028, Centre de Recherche en Neurosciences de Lyon, Equipe Dynamique Cérébrale et Cognition, Centre Hospitalier le Vinatier, Bâtiment 452, 95 BD Pinel, F-69500 Bron, France. juan.vidal@inserm.fr