Several studies show that the amplitudes of human brain oscillations are modulated during the performance of visual working memory (VWM) tasks in a load-dependent manner. Less is known about the dynamics and identities of the cortical regions in which these modulations take place, and hence their functional significance has remained unclear. We used magnetoencephalography and electroencephalography together with minimum norm estimate-based source modeling to study the dynamics of ongoing brain activity during a parametric VWM task. Early stimulus processing and memory encoding were associated with a memory load-dependent spread of neuronal activity from occipital to temporal, parietal, and frontal cortical regions. During the VWM retention period, the amplitudes of oscillations in theta/alpha- (5–9 Hz), high-alpha- (10–14 Hz), beta- (15–30 Hz), gamma- (30–50 Hz), and high-gamma- (50–150 Hz) frequency bands were suppressed below baseline levels, and yet, in frontoparietal regions, load dependently strengthened. However, in occipital and occipitotemporal structures, only beta, gamma, and high-gamma amplitudes were robustly strengthened by memory load. Individual behavioral VWM capacity was predicted by both the magnitude of the N1 evoked response component in early visual regions and by the amplitudes of frontoparietal high-alpha and high-gamma band oscillations. Thus, both early stimulus processing and late retention period activities may influence the behavioral outcome in VWM tasks. These data support the notion that beta- and gamma-band oscillations support the maintenance of object representations in VWM whereas alpha-, beta-, and gamma-band oscillations together contribute to attentional and executive processing.
Working memory (WM) maintains and manipulates information (Baddeley, 1996) and is based on the retention of integrated neuronal object representations (Luck and Vogel, 1997; Zhang and Luck, 2008). The binding of sensory feature representations into object representations and the active maintenance of these representations in WM may depend on millisecond-range synchronization of neuronal network oscillations (Fries, 2009; Singer, 2009; Palva et al., 2010b). In human magnetoelectroencophalography (MEG) and electroencephalography (EEG) recordings, the maintenance of visual (Tallon-Baudry et al., 1998; Osipova et al., 2006; Jokisch and Jensen, 2007; Medendorp et al., 2007; Haenschel et al., 2009), auditory (Leiberg et al., 2006; Kaiser et al., 2008) and somatosensory (Haegens et al., 2010) information in WM is associated with increased beta- and gamma-frequency band amplitudes. However, the amplitudes of theta- and alpha-band oscillations also are modulated during WM maintenance (Jensen et al., 2002; Busch and Herrmann, 2003; Onton et al., 2005; Leiberg et al., 2006; Osipova et al., 2006; Jokisch and Jensen, 2007; Haenschel et al., 2009). Prior studies have used several distinct WM tasks and memory retention intervals, which make it difficult to obtain an integrative view on the insight given by these studies into brain functioning. The difficulties in inferring the physiological roles of scalp-recorded oscillations also stem from the absence of source modeling in most studies. Studies using MEG/EEG (M/EEG) source modeling suggest that the amplitude modulations in visual WM (VWM) tasks arise in occipital and parietal brain regions (Osipova et al., 2006; Jokisch and Jensen, 2007; Medendorp et al., 2007). Functional magnetic resonance imaging (fMRI) has, in addition, revealed VWM maintenance-related activity in several frontal and temporal regions as well as in the cingulate and insular cortices (Prabhakaran et al., 2000; Rowe et al., 2000; Munk et al., 2002; Pessoa et al., 2002; Sakai et al., 2002; Linden et al., 2003; Todd and Marois, 2004; Mohr et al., 2006; Axmacher et al., 2007). Intracranial recordings of epileptic patients also show amplitude modulations during VWM maintenance in widespread occipital, temporal, parietal, and frontal cortical regions (Raghavachari et al., 2001; Howard et al., 2003; Mainy et al., 2007; Sederberg et al., 2007; Meltzer et al., 2008), and in the hippocampus (Axmacher et al., 2007). Indeed, by using fMRI priors in source modeling, sustained event-related activity (Wibral et al., 2008; Scheeringa et al., 2009; Robitaille et al., 2010) as well as theta- (Morgan et al., 2010) and alpha-band (Grimault et al., 2009) amplitude modulations have been identified in occipital, temporal, and frontal cortical regions.
In the present study, we set out to characterize comprehensively the neuronal substrates underlying a parametric and psychophysically well validated VWM task using concurrently acquired M/EEG data. We used data-driven statistical source space analyses to map in individual cortical anatomy the dynamics of early stimulus processing and later amplitude modulations of ongoing oscillations. The investigation aimed at identifying the cortical regions and time-frequency windows where brain activity is modulated by VWM task performance, correlated with memory load, and/or predictive of individual behavioral performance.
Materials and Methods
Task and stimuli.
We used a delayed matching-to-sample task where the subjects' were presented a sample stimulus that contained one to six squares, and had to be sustained in WM and then compared with a test stimulus (Luck and Vogel, 1997; Palva et al., 2010a,b) (supplemental Fig. 1A, available at www.jneurosci.org as supplemental material). The sample stimulus was presented for 0.1 s and was followed by a 1 s memory retention period, after which a test stimulus was presented for 0.5 s. The subjects indicated with a forced-choice left- or right-hand thumb twitch whether the stimuli were “same” or “different.” The hand response correspondence was randomized across subjects. After this primary response, the subjects indicated whether or not they felt sure about their response with a second go/no-go, respectively, twitch of the same thumb.
Accuracy was assessed as the proportion of correct responses from among all responses (supplemental Fig. 1B, available at www.jneurosci.org as supplemental material). Reaction time was given by the time from the sample stimulus onset to the onset of the electromyogram (see Subjects and recordings, below) activity corresponding to the primary response (supplemental Fig. 1C, available at www.jneurosci.org as supplemental material). All analyses were based on artifact-free trials with correct behavioral responses.
Subjects and recordings.
Thirteen healthy right-handed (age, 28 ± 3 years, mean ± SD; 4 females) volunteers with normal or corrected-to-normal vision participated in this study. The 366-channel M/EEG data were recorded with 204 planar gradiometers, 102 magnetometers, and 60 EEG electrodes (Elekta Neuromag) at a 600 Hz sampling rate throughout the experiment (supplemental Fig. 1D, available at www.jneurosci.org as supplemental material). The thumb-twitch responses were recorded with electromyography (EMG) of abductor/flexor pollicis brevis and detected with an automatic algorithm. All EMG responses were also verified visually. The electro-oculogram was used to detect ocular artifacts. Trials with the electro-oculogram signal exceeding 50 μV were excluded from further analysis. The MaxFilter software (Elekta Neuromag) was used to suppress extracranial noise and to colocalize the signal space data from different recording sessions and subjects. For cortical surface reconstructions, we recorded T1-weighted (magnetization-prepared rapid-acquisition gradient echo) anatomical MR images at a ≤1 × 1 × 1 mm resolution with a 1.5 T MRI scanner (Siemens) (supplemental Fig. 1E, available at www.jneurosci.org as supplemental material). This study was approved by the ethical committee of Helsinki University Central hospital and was performed according to the Declaration of Helsinki. Written informed consent was obtained from each subject before the experiment.
Data analysis, Morlet wavelet transform.
Each channel, yj(t), of the single-trial M/EEG time-series data, Y(t), with nc channels j = 1 … nc, was filtered into 36 frequency bands, f, f = 3…90 Hz, with a bank of Morlet wavelets, h(t,f), so that the complex filtered signal yF(t,f) is given by yF(t,f) = y(t) * h(t,f), where * denotes convolution and h(t,f) = A exp(−t2/ 2σt2)exp(2iπft) (supplemental Fig. 1G, available at www.jneurosci.org as supplemental material). The time-domain SD of the wavelet is given by σt = m/2πf, where we selected m = 5 to define a compromise between time and frequency resolution, and f is the center frequency of the wavelet.
Finite impulse response filtering.
In the second stage of this study, we used a set of paired high- and low-pass finite impulse response (FIR) filters that gave a better temporal resolution than the Morlet wavelets and represented the spectral content captured by four adjacent wavelets with a single filter in each of the four frequency bands of interest delineated by the prior wavelet analysis. The stop and pass bands of the high-pass filters, and the pass and stop bands of the low-pass filters, respectively, were as follows (in Hz): theta/alpha [2.375, 4.75, 6.65, 9.5]; high-alpha [4.75, 9.5, 13.3, 19.0]; beta [9.5, 19.0, 26.6, 38.0]; gamma [14.25, 28.5, 37.05, 49.875]; and high-gamma [48.75, 75.0, 112.5, 150.0]. The parameters for the FIR filters used in evoked response (ER) analyses were (in Hz) [0.01, 0.1, 25, 45] and [1, 2, 25, 45].
Forward and inverse modeling.
The FreeSurfer-anatomical-MRI analysis software (http://surfer.nmr.mgh.harvard.edu/) was used for automatic volumetric segmentation of the MRI data, surface reconstruction, flattening, cortical parcellation, and labeling with the Freesurfer/Destrieux atlas (Dale et al., 1999; Fischl et al., 1999; Fischl et al., 2001, 2002, 2004; Ségonne et al., 2004; Desikan et al., 2006) (supplemental Fig. 1H, available at www.jneurosci.org as supplemental material). The MNE software (http://www.nmr.mgh.harvard.edu/martinos/userInfo/data/sofMNE.php) was used for creating three-layer boundary element conductivity models, cortically constrained source models, for the M/EEG-MRI colocalization, and for the preparation of the forward and inverse operators (Hämäläinen and Sarvas, 1989; Hämäläinen and Ilmoniemi, 1994; Nenonen et al., 1994; Mosher et al., 1999; Lin et al., 2006) (supplemental Fig. 1I,J, available at www.jneurosci.org as supplemental material). The source models for each individual were based on tessellated and decimated cortical surfaces where the dipoles were fixed normal to the surface and had an ∼7 mm source-to-source separation. Depending on the brain size, the source models had a total of nd = 6000–8500 sources in the two cerebral hemispheres.
M/EEG sensor signals, Y, are linearly related to current strengths in nd source dipoles, X = [xk], k = 1 … nd, so that Y(t) = ΓX(t) + N(t), where Γ is the lead field matrix (i.e., the forward operator), and N denotes noise. We obtained X(t) from measured Y(t) by using a minimum-norm estimator (Hämäläinen and Sarvas, 1989; Nenonen et al., 1994), so that X(t) = ΜY(t) = CsΓT(ΓCs ΓT + λ2Cn)−1Y(t), where Μ is the inverse operator, λ2 is a regularization parameter, Cs is the source covariance matrix, and Cn is the noise covariance matrix. We used λ2 = 0.05 and a multiple of the identity matrix as R, scaled so that the norm of ΓCs ΓT, whitened spatially using Cn, equals the number of channels (see MNE manual). The inverse operator, Μ, was prepared separately for each wavelet filter or FIR filter frequency with the noise covariance matrices obtained from filtered single-trial prestimulus baseline windows of all trials (supplemental Fig. 1K, available at www.jneurosci.org as supplemental material). The complex inverse solution, XF = [xF,k], was obtained from the inverse estimates of the real, YF,RE, and imaginary, YF,IM, parts of the filtered data YF, so that XF = ΜF YF,RE(t) + i ΜF YF,IM(t), where i is the imaginary unit (supplemental Fig. 1L, available at www.jneurosci.org as supplemental material).
We used two cortical parcellation strategies in this study. The aims of source-model parcellation were to compress single-trial data and to transform the data into nameable anatomical regions that were common to all subjects. A parcellation denotes the grouping of vertices into a number of patches. In the Morlet wavelet-based analyses, we employed the 156 labels of the FreeSurfer atlas (Fischl et al., 2002) covering the entire cortical surface (see supplemental Fig. 1H, available at www.jneurosci.org as supplemental material) and then further split the largest patches to obtain the anatomical parcellation, PAN240, comprising a total of 240 patches (supplemental Fig. 1M, available at www.jneurosci.org as supplemental material). The splitting was carried out iteratively so that in each iteration the patch having the greatest mean number of vertices was selected and split along the one of the three axes (anterior–posterior, lateral–medial, ventral–dorsal) that had the largest mean variance of vertex locations. This procedure gives an anatomical parcellation wherein the cut directions are identical for every subject. The individual anatomical data were thus directly comparable in group statistics of the subject population without a need for morphing the data across subjects (Palva et al., 2010b). The FIR filter-based analyses were performed with two parcellations: the single-trial data were compressed to and averaged with an individual cluster parcellation of 700 clusters, PCL700; group statistics were then performed with an anatomical parcellation with 700 patches, PAN700, after morphing the data from each individual's PCL700 to that individual's PAN700 (Palva et al., 2010b). The individual subject's parcellations were obtained by (1) inverse modeling baseline data that was FIR filtered into a frequency band from 150 to 224 Hz (filter parameters as above: [96.85, 149.0, 223.5, 298.0]), (2) evaluating vertex-by-vertex phase-locking, value-based, 1:1 phase-synchrony matrices within each of the 150 patches of the Destrieux atlas (3) performing hierarchical clustering of source space vertices so that always the pair of vertices (or vertex clusters) that had the greatest phase-locking value were clustered. The hierarchical clustering was continued until 700 clusters remained. The hierarchical clustering approach was thus similar to the one described earlier (Palva et al., 2010b), but here the clusters were limited to the 150 anatomical patches of the Destrieux atlas. This clustering compressed single-trial source space data with minimal loss of information because the clusters reflected vertices that were highly correlated in the original recording and/or in the inverse transform. The 150–224 Hz band was used for this purpose because the amplitude [and signal-to-noise ratio (SNR)] of neuronal activity therein is negligible and hence the phase-synchrony estimates describe very well the correlations inherently present in the M/EEG inverse solutions.
The time series of the ∼7000 source vertices were collapsed into time series of the cortical patches (supplemental Fig. 1N, available at www.jneurosci.org as supplemental material) so that the collapsed inverse solution, XF,P = [xF,P,l], l = 1 … np, is given by XF,P = Π(XF, P), where Π is a collapse operator, P is the parcellation, P = [pl], and np is the number of patches, pl, in P. In the Morlet wavelet analyses, we defined Π so that the patch time series was obtained from vertex time series simply by averaging over the time series of the individual vertices of within that patch. In the FIR filter-based analyses, Π was defined so that the amplitudes of the individual vertex time series were averaged as above but the phases of a subset of vertices were shifted by π. This was done because in a source model with fixed dipole orientations, the time series for a single dipole in one wall will be 180° out of phase with the time series of the dipole in the opposite wall. These to-be-shifted vertices were identified from the within-patch phase synchrony estimates as that minority of vertices that had a phase difference of approximately π with the majority of vertices having a zero-lag phase difference. Vertices with an opposite polarity arise in cortical patches where the patch, for instance, contains both sides of the same sulcus. The phase shifting prevents vertices with opposite polarities from canceling each other when the vertex time series are averaged into the patch time series.
Evoked response, amplitude, and stimulus-locking analyses.
We used the collapsed inverse estimates, XF,P,r(t,f), of single trials r, r = 1 … ns, for cortex-wide mapping of evoked responses and the phase locking of ongoing activity to the sample stimuli [hereafter called “stimulus locking” (SL)] as well as for the mapping of event-related amplitude dynamics (supplemental Fig. 1O, available at www.jneurosci.org as supplemental material). Stimulus locking was quantified with the phase-locking value, PLV(t,f) (also known as the phase-locking factor), that was given for each patch a = 1 … np by PLV = ns−1|Σr (xF,P,a/|xF,P,a|)| (Sinkkonen et al., 1995). The averaged event-related amplitude envelopes, A(t,f), were given by A = ns−1Σr (|xF,P,a|) (Tallon-Baudry et al., 1996; Palva et al., 2005) and the evoked responses, ER(t,f), simply by the averaging the real parts of the time series so that ER = ns−1Σr [Re(xF,P,a)],
Group statistics: primary statistical analyses.
Before the statistical group analyses (Figs. 1, 2; see Figs. 4, 5, 7; supplemental Figs. 1 P, 2–8, available at www.jneurosci.org as supplemental material), the individual subject's PLV(t,f), A(t,f), and absolute-value evoked responses (ER) were baseline corrected by subtracting the mean of the pre-sample-stimulus period from −450 to −150 ms (Morlet wavelet data) or from −700 to −100 ms (FIR filter data). The data were then pooled across subjects by applying a statistical test to obtain PLVS and AS for each time frequency element of PLV and A, and separately for each VWM load, LM = 1, 2, 3, 4, 5, and 6. In the average condition and for all VWM loads separately, the baseline-corrected PLV, A, and ER were tested against a null hypothesis of PLV = 0, A = 0, and ER = 0 by using the Wilcoxon signed rank or Student's t test. These conditions thus address whether the studied values are significantly different from the mean baseline value. The correlations of the PLV, A, and ER values in each time, frequency, and cortical location with memory load (load condition) were estimated by using the Spearman's rank correlation test across the memory loads LM = 1, 2, 3, 4, 5, and 6 so that the values were first rank transformed within each subject and then used in the statistical group analysis across subjects. Individual variability in response magnitudes thus did not influence the outcome of the analysis.
Post hoc analyses.
Post hoc analyses were performed to a selection of brain regions showing significant modulation by the VWM load (Fig. 3; see Fig. 6; supplemental Figs. 9, 10, available at www.jneurosci.org as supplemental material). The peak latencies and amplitudes were estimated from the peaks searched from the signal in two time windows (80–120 and 160–200 ms, respectively, for the first and second peak). The mean ER magnitudes were estimated from the same time windows, whereas the mean oscillation amplitudes were estimated from the late retention period time window (700–1000 ms). The selection of time window did not affect the results qualitatively.
The correlations of the ER and oscillation amplitude with the individual memory capacity were identified so that we first obtained mean ER and A values separately for each of the six memory loads from 1 to 6, fitted a third-order polynomial to this mean-as-a-function-of-load series, and subtracted the linear trend. The “predicted capacity values” were then obtained by finding the first positive peak. The correlation between the predicted and real capacity values was estimated by the Spearman rank correlation test.
During the M/EEG recordings, the subjects performed a delayed match-to-sample VWM task (supplemental Fig. 1A, available at www.jneurosci.org as supplemental material) (Luck and Vogel, 1997; Palva et al., 2010a,b) wherein they memorized a 0.1 s sample stimulus containing one to six colored squares. A test stimulus appeared 1 s after the offset of the sample, and in 50% of the trials one square in the test had a different color than that square in the sample. The subjects indicated with a forced-choice left- or right-hand thumb twitch whether or not the sample was identical to the test. As reported for earlier (Luck and Vogel, 1997) and present (Palva et al., 2010b) data, the behavioral accuracy decreases with increasing memory load (supplemental Fig. 1B, available at www.jneurosci.org as supplemental material), while the reaction times become longer (supplemental Fig. 1C, available at www.jneurosci.org as supplemental material). Thus, this VWM task yields quantitatively similar psychophysical data during repeated M/EEG recordings as obtained outside the neuroimaging context.
Stimulus-processing and memory-encoding period
The aim of this study was to identify the cortical sources and dynamics of VWM-related neuronal activities and to address their functional significance in VWM by finding the components that are correlated with memory load and/or predict individual psychophysical performance. The processing of the sample and test stimuli was examined first by evaluating absolute-value ERs with two broad-band FIR bandpass filters (1–45 and 0.01–45 Hz) across the six memory load conditions. The data were summarized into waveforms indicating the fraction of brain regions where the ERs were significantly [p < 0.001, false discovery rate (FDR) corrected] above the baseline level. Both sample and test stimuli were associated with ERs lasting approximately the first 400 ms from stimulus onset (Fig. 1A). Interestingly, the ERs obtained with the lower high-pass filter (0.01–45 Hz) remained above the baseline throughout the retention period and thus contained a slow baseline shift-like component. We then estimated the effect of memory load on the evoked responses by using Spearman's rank correlation test (p < 0.001, FDR corrected), which revealed that the ER components from 60 to 200 ms, but not those after 200 ms from stimulus onset, were strongly modulated by the number of objects in sample stimuli (Fig. 1B). One should note that in the present experiment the number of objects covaried with the overall complexity of the stimulus and hence the strengthening of ERs with increasing memory load may be at least partly attributed to a concurrent change in physical stimulus properties.
ERs are generally thought to be generated by additive evoked activity but may also be influenced by event-related amplitude changes of non-zero mean oscillations (Nikulin et al., 2007; de Munck and Bijma, 2010) and phase resetting (Makeig et al., 2002) of ongoing activity. To consider the contributions of these mechanisms into the generation of the ERs in the present VWM task, we quantified the phase locking of ongoing activity to the sample and test stimulus onsets (i.e., SL) and evaluated the peristimulus amplitude dynamics by using the same broad-band filters that were used for ERs (Palva et al., 2005). The dynamics of SL was similar to that of ERs, but SL was significant in a greater fraction of cortical regions (Fig. 1C). The amplitude of ongoing cortical broad-band activity, on the contrary, was only briefly strengthened above the baseline level at around 100 ms after the sample stimulus onset and thereafter suppressed below the baseline level for the rest of trial. Like ERs, both SL and the early amplitude enhancement were robustly correlated with memory load (Fig. 1D).
We next identified the cortical regions involved in the processing of sample stimuli. The first significant ER components (p < 0.001, FDR corrected) were found at around 70 ms. As the broad-band filter provided a good temporal resolution, we identified the ER sources in 10–20 ms steps (supplemental Figs. 2, 3, available at www.jneurosci.org as supplemental material). The evoked activity spread rapidly from V1/V2 to higher level extrastriate, occipitotemporal, and parietal regions. One should note here, however, that the inverse modeling method used here (MNE) provides a distributed source estimate that is likely to be more widespread than the true cortical source. Hence, the source reconstructions contain cross talk between neighboring cortical patches in spatial scales of a few centimeters (Hauk et al., 2011). The P1 peak at 120 ms was observed in distributed visual areas in the occipital and inferior temporal cortices including the lateral occipitotemporal cortex (LOT), inferotemporal gyrus (ITG), and the intraparietal sulcus (IPS), as well as in the insular and cingulate cortices (Fig. 2A). The N1 peak at 190 ms was observed also in distributed regions in the posterior parietal cortex (PPC) as well as in central and superior temporal sulci. This constellation of occipital, temporal, and parietal regions was sustained for approximately 50 ms, after which it gradually degraded. Broad-band SL was strongest in the same network of brain regions as the evoked response, but with a more smeared temporal evolution. This was expected because the phase estimates of ongoing activity, by definition, incorporate information across a time window that is dependent on the characteristic time scales of the activity itself. This may, in part, also explain the greater sensitivity of SL compared with ERs. It is interesting to also note that SL revealed early stimulus processing in several frontal cortical regions including the frontal eye fields (FEFs) and lateral prefrontal cortex (LPFC) as well as in anterior insula (supplemental Figs. 2, 3, available at www.jneurosci.org as supplemental material). In contrast, the brief broad-band amplitude enhancement was limited to striate cortex and extrastriate regions surrounding the occipital pole (supplemental Figs. 2, 3, available at www.jneurosci.org as supplemental material).
VWM load strengthened ERs, amplitudes, and SL mostly in the same regions that were identified in the average condition (Fig. 2B; supplemental Figs. 4, 5, available at www.jneurosci.org as supplemental material). While the memory load effect at P1 peak was limited to occipital, occipitotemporal, and posterior parietal cortices, the VWM load strengthened N1 peak additionally in superior temporal, superior and orbitofrontal, insular, and cingulate cortices. The correlation of amplitude with memory load was robust throughout the posterior cortical networks and well colocalized with the spatiotemporal evolution of ERs (supplemental Figs. 4, 5, available at www.jneurosci.org as supplemental material). The SL, in contrast, was also strongly correlated with memory load in several more anterior structures, including the anterior cingulate and insula as well as paracentral and superior precentral sulci (supplementary motor area and FEF, respectively). Hence, these results suggest that in these data in posterior brain regions, ERs could have been generated by additive stimulus-evoked activity, but in the anterior brain regions phase resetting of ongoing oscillations contributed to the generation of ERs.
Finally, we also localized the sustained slow evoked component that was salient in the ER obtained with 0.01–45 Hz filter (Fig. 1A). After the termination of the transient ER and in the time window from 600 to 1100 ms, the slow component was strongest in the LPFC and cingulate cortices as well as in paracentral, precentral, and postcentral areas (Fig. 2C). More posteriorly, the slow component was also observed in PPC and lingual gyrus.
Moreover, while the slow retention period ER component was largely colocalized with frontal load-dependent oscillation amplitude modulations (see Cortical sources of retention period modulations of oscillation amplitudes, below), it was uncorrelated with memory load, which suggests that this component might not be related to the amplitude changes of alpha or faster oscillations through the non-zero mean mechanism. The insensitivity to VWM load of the slow ER component suggests that it could reflect arousal- or alertness-related fluctuations (Monto et al., 2008; Schroeder and Lakatos, 2009).
Early evoked responses are correlated with VWM performance
To shed light on the functional significance of neuronal activity reflected in the ERs, we characterized the early components in more detail with post hoc analyses separately for each memory load condition in those cortical regions wherein the ERs were significantly correlated with memory load (Fig. 2B). We first averaged the absolute-value ER time series across this selection (Fig. 3A) and then used peak detection to estimate the mean latencies and amplitudes of the P1 and N1 peaks (at ∼120 and 190 ms, respectively) across the subject population (Fig. 3B). The peak latencies were relatively stable across the memory loads, and while N1 appeared to be earlier for higher VWM loads, this tendency was not significant. The peak amplitudes were enhanced with VWM load (Fig. 3C). We then focused the analysis into two anatomically defined regions of interest: one approximately containing V1 and V2 (calcarine fissure, cuneus, occipital pole, and lingual gyrus), and another comprising the remaining extrastiate visual regions in lateral occipital (LO) and inferior occipital (IO) cortices (Figs. 2B, 3D). The strength of P1 was larger in the V1/V2 than in the LO/IO selection (Fig. 3D), but in both regions of interest the strength of P1 and N1 was increased with the VWM load (Fig. 3E). Interestingly, while the amplitude of the P1 was essentially linearly dependent on the number of objects in the sample stimulus, the amplitude of N1 increased from one to three objects and plateaued thereafter. This plateau onset is near the mean behavioral memory capacity in these data (4.1 ± 0.2, mean ± SEM) (Palva et al., 2010b), and, hence, we investigated whether this plateau onset in individual subjects' ERs would be correlated with their VWM capacity. We estimated the plateau onset by detecting the first positive peak in the difference of third- and second-order polynomials that were least squares fitted to the series of ER peak amplitudes for the six memory loads (see Materials and Methods). The correlation of this ER-predicted capacity value with the real behavioral capacity value was then evaluated with Spearman's correlation test. Intriguingly, this analysis showed that strength of N1 was, indeed, positively correlated with memory capacity in V1/V2 but not in LO/IO (Fig. 3F). P1 was uncorrelated with behavioral capacity in both V1/V2 and LO/IO.
Because early ER components are well known to reflect such physical stimulus properties, the memory load dependence of P1 in these data cannot be used to infer a role for in VWM encoding. However, the correlation of N1 with individual behavioral memory capacity indicates that ERs in this latency range already reflect VWM-related processing and are not fully determined by the physical properties of the stimuli.
Oscillation dynamics in VWM
The principal aim of this study was to identify the cortical regions where non-stimulus locked ongoing oscillations are correlated with the VWM retention, memory load, or behavioral performance. ERs, reflecting phasic stimulus-locked brain activity, lasted until ∼400 ms from the sample stimulus onset and were VWM capacity sensitive already at ∼200 ms, suggesting that memory encoding could be completed before 400 ms. We thus consider the VWM retention period in these data to be from 400 to 1100 ms from the sample stimulus onset. The majority of studies on the role of oscillations in VWM have used retention periods of up to 3 s. It has, however, been suggested that the mechanism of memory maintenance may change after the first second of memory retention (Todd and Marois, 2004). The present study explores the neuronal substrates of this 1 s memory retention with an experimental paradigm that has been psychophysically well validated (Luck and Vogel, 1997; Todd and Marois, 2004; Vogel and Machizawa, 2004; Vogel et al., 2005). To grasp the temporal and spectral evolution of event-related ongoing activity, we first computed time-frequency (TF) representations (TFRs) of SL and oscillation amplitudes by using a bank of 36 Morlet wavelets from 3 to 90 Hz. Similar to the analyses of fMRI data (Kriegeskorte et al., 2009), TF analyses of MEG data are susceptible to statistical inflation arising from circular data inspection and region of interest selection. To have a noncircular discovery strategy, we first pooled the statistical TF data across all brain regions by plotting for each TF element the fraction of statistically significant patches (or M/EEG sensors). To control for multiple comparisons, we removed the number of significant TF elements predicted to be false discoveries at the chosen alpha level, Α. Because VWM is known to involve a widely distributed cortical network, this approach was expected to reveal those TF regions where large-scale cortical activity was modulated by VWM processing.
We first evaluated the changes from baseline level with Wilcoxon signed rank test for data averaged across the six memory loads. The TFR for SL showed that frequencies up to 40 Hz were phase locked to the stimulus onset for several hundreds of milliseconds from stimulus onset without a clear oscillatory component (A = 0.001, FDR corrected) (Fig. 4A). The TFR of oscillation amplitudes revealed an amplitude increase in the theta/alpha band during the first 400 ms after stimulus onset, which coincided with the period of strong stimulus-locked activity (A = 0.01, FDR corrected) (Fig. 4B). After this transient, the oscillation amplitudes were suppressed below the baseline level throughout the analyzed frequency range from theta/alpha- to gamma-frequency bands. This suppression was also evident in broad-band filtered data (Fig. 4). We also confirmed that the recording method would not affect the observed pattern of SL or oscillation amplitudes by computing the TFRs separately for MEG gradiometers, magnetometers, and EEG electrodes. All three recording methods converged with source modeling on a similar result for both SL and amplitude dynamics. The fraction of significant sensors was greater than the fraction of significant source patches, suggesting that the source modeling was successful in disentangling cortical sources from their mixed sensor level signals (Fig. 4A,B; supplemental Fig. 6, available at www.jneurosci.org as supplemental material).
To investigate the impact of memory load on the oscillatory dynamics, we identified the brain regions in which SL or amplitude dynamics were correlated with memory load by using the Spearman's rank correlation test with the null hypothesis that the SL or amplitude values in the six individual memory load conditions were uncorrelated among the memory loads from one to six objects. Before testing, the individual subjects' data were rank transformed to ensure comparability across the population. The TFR of SL revealed an early positive correlation with memory load (A = 0.001, FDR corrected) (Fig. 4C). The TFR of memory load-dependent amplitude changes showed an interesting pattern. Similar to SL and ERs, the amplitudes of theta/alpha- and high-alpha-frequency bands were positively correlated with memory load during the first 200 ms after stimulus onset (A = 0.01, FDR corrected) (Fig. 4D). The early positive effect thereafter changed into a negative correlation in the theta/alpha, high-alpha, and beta bands, which indicates that the amplitude suppression was even deeper at higher memory loads. The negative correlation, however, was sustained through the VWM retention period only in the theta/alpha band. The amplitudes of high-alpha and beta bands were positively correlated with memory load from 400 ms to the end of the VWM retention period, even though they were suppressed below the baseline level in the average condition.
On the basis of the TF mapping, we designed a set of broad-band FIR filters to capture the theta/alpha-, high-alpha-, beta-, gamma-, and high-gamma-frequency bands with a temporal resolution greater than that provided by the wavelets, and we also used a patch collection with a spatial resolution higher than was used in the wavelet-based analysis (see Materials and Methods). The amplitude time series obtained with these filters reproduced very well the phenomena discovered with wavelets (A = 0.01, FDR corrected) (Fig. 4E), which shows that the results presented here are not qualitatively affected by the filtering or cortical parcellation approaches. However, the dynamics obtained with these broad-band filters revealed that also gamma- and high-gamma-band amplitudes were strengthened with increasing memory load.
Cortical sources of retention period modulations of oscillation amplitudes
We used the FIR filtered theta/alpha-, high-alpha, beta-, gamma- and high-gamma-band data to identify the cortical regions underlying the amplitude modulations during VWM retention. In data averaged across all six memory loads, the amplitudes were suppressed in distributed visual regions in striate, extrastriate, temporal, and parietal cortices (A = 0.01) (Fig. 5A). In the beta and gamma bands, the suppression was most pronounced in medial occipital and parietal cortices, but also observed in occipitotemporal and temporal visual areas. In addition, the suppression was observed in the same cortical regions during the early (400–700 ms from sample stimulus onset) and late (700–1000 ms from sample stimulus onset) parts of the retention period (supplemental Fig. 7, available at www.jneurosci.org as supplemental material).
Increasing memory load was associated with a further suppression of amplitudes in the theta/alpha- and high-alpha bands in the occipital, occipitotemporal, and parietal regions (Fig. 5B). The amplitudes in the high-alpha, beta, gamma, and high-gamma bands, and very weakly in the theta/alpha band, were strengthened in widespread frontoparietal cortical regions, including the LPFC, precentral and postcentral regions, and cingulate, insular, and orbitofrontal cortices. Interestingly, the beta- and gamma-band amplitudes were, in addition, strengthened in several visual regions in the occipital and occipitotemporal, and temporal cortices that are known to play a central role in the formation of visual object representations (Riesenhuber and Poggio, 2002; Grill-Spector and Malach, 2004; Konen and Kastner, 2008). While the anterior positive correlations remained very similar at A = 0.001, the posterior positive correlations in beta and gamma bands did not exceed this higher alpha level. High-gamma-band amplitudes were positively correlated with memory load in LPFC, posterior parietal, temporal, and insular cortical regions. Importantly, the network of load-dependent high-gamma oscillations was spatially stable throughout the retention period (see supplemental Fig. 7, available at www.jneurosci.org as supplemental material).
The Morlet wavelet analysis approach revealed essentially the same frontoparietal pattern, indicating that the parcellation scheme did not affect the primary results (A = 0.01, FDR corrected) (supplemental Fig. 8, available at www.jneurosci.org as supplemental material). However, the memory load-dependent strengthening of beta- and gamma-band amplitudes in visual regions did not exceed the significance threshold, which is likely to be caused by a greater spectral variability in these bands across the subjects.
Correlation of oscillation amplitudes with memory load and behavioral capacity
The results have so far clearly indicated that the amplitudes of high-alpha-, beta-, gamma-, and high-gamma-band oscillations were positively correlated with memory load in a frontoparietal network (Fig. 5B). To further characterize this phenomenon with post hoc analyses, we estimated the amplitudes in all frequency bands separately for each memory load and individual subject as an average across the cortical regions that were positively correlated with memory load (Fig. 5B). These data corroborated that the amplitudes were indeed memory load dependent during the memory retention period (Fig. 6A). Evaluation of mean amplitudes averaged over the latter half of the retention period (700–1000 ms) showed that oscillation amplitudes were systematically increased with memory load, but with a biphasic shape and a plateau at three to four objects (Fig. 6B). The analysis for the Morlet wavelet-filtered data showed similarly that both the strengths and fraction of significant cortical regions were systematically modulated by the increasing memory load (supplemental Fig. 9A–D, available at www.jneurosci.org as supplemental material).
To address the link between the retention period amplitude modulations and individual behavioral VWM capacity, we used the same approach as above for ERs (Fig. 3). In the bilateral frontoparietal regions, the plateau onsets of the high-alpha-band amplitudes were negatively correlated with memory capacity (Fig. 6C), implying that in high-capacity subjects, the load-dependent alpha strengthening leveled off at lower memory loads than in low-capacity subjects. On the other hand, despite a similar negative trend, neither beta- nor gamma-band amplitudes were significantly correlated with the memory capacity. Interestingly, although the high-gamma-band amplitudes were only weakly correlated with the VWM load, the plateau onsets in high-gamma band were clearly positively correlated with the individual VWM capacity.
We also investigated the correlation of the load-dependent amplitude suppression with memory capacity. We first estimated the relative amplitudes from those cortical regions that were negatively correlated with the memory load (supplemental Fig. 10A, available at www.jneurosci.org as supplemental material). The amplitudes were then averaged over the retention period (supplemental Fig. 10B, available at www.jneurosci.org as supplemental material). Last, we estimated the correlation between the predicted and real capacity values (supplemental Fig. 10C, available at www.jneurosci.org as supplemental material). The amplitude suppression was not, however, significantly correlated with the individual memory capacity in any of the studied frequency bands.
Finally, to corroborate the group analysis approach, we asked whether low- and high-capacity subject populations were associated with qualitatively and phenomenologically similar retention period oscillation dynamics. The subject group was divided into two halves by individual memory capacity, and cortical regions where oscillation amplitudes were correlated memory load were identified with Spearman's rank correlation test (p < 0.01, FDR corrected) as above (Fig. 7). The left-lateralized but bilaterally highly significant frontoparietal high-alpha-, beta-, and gamma-band oscillations were observed in both subject populations, although they were more pronounced for the high-capacity subjects. On the other hand, only the high-capacity group was associated with a significant memory load correlation in the high-gamma band. The apparent absence of memory load-dependent high gamma in the low-capacity group may, however, be associated with the overall poor signal-to-noise ratio in this frequency band. Interestingly, the amplitude suppression in the high-alpha band was less pronounced for the high- than the low-capacity subjects.
Stimulus processing and VWM encoding during early stimulus-locked activity
The processing of sample stimuli was associated with a rapid spread of activity during the first ∼200 ms from occipital to temporal and parietal cortices. These regions are known to underlie the formation of visual object representations (Singer, 1999; Riesenhuber and Poggio, 2002; Grill-Spector and Malach, 2004; Konen and Kastner, 2008). The P1 component (∼120 ms) was linearly strengthened by the VWM load, whereas the magnitude of the N1 component (∼190 ms) plateaued with increasing memory load. This plateau onset was positively correlated with individual behavioral VWM capacity in the early visual regions. Interestingly, during the N1 component, activity in visual regions was accompanied by concurrent activation of frontal and parietal structures. It is thus likely that the stage of stimulus processing and memory encoding already reflected in the N1 component, frontoparieto–sensory interaction, and frontoparietal top-down modulation of early visual regions give rise to individual behavioral memory capacity limits.
Localization of oscillation amplitude dynamics during VWM retention
Evoked stimulus processing was followed by amplitude suppression in posterior, central, and frontal VWM-related cortical areas in frequency bands from 5 to 90 Hz. A similar wide-band amplitude suppression has been observed to follow sensory stimuli in MEG (Palva et al., 2005) and to characterize the VWM retention period in human EEG (Medendorp et al., 2007) and in direct cortical (Axmacher et al., 2007; Meltzer et al., 2008) recordings. In the present and prior VWM MEG/EEG data, increasing memory load strengthened the amplitudes of high-alpha- (Busch and Herrmann, 2003; Leiberg et al., 2006; Grimault et al., 2009; Haenschel et al., 2009), beta- (Leiberg et al., 2006), and gamma- (Tallon-Baudry et al., 1998; Osipova et al., 2006; Haenschel et al., 2009) frequency bands. In the present data, the amplitudes of high-alpha, beta-, gamma-, and high-gamma-frequency bands were positively correlated with memory load in several frontal, parietal, and temporal regions as well as in the cingulate and insular cortices. Prior MEG studies suggest that the positive memory load dependence in the α band originates from occipitoparietal brain regions (Jensen et al., 2002; Osipova et al., 2006; Medendorp et al., 2007), which is in contrast with the frontal sources found here. However, the central difference in these studies is the time window of the effect. In the present study, the retention period effects are observed in a time window of 0.5 to 1 s, and in those cited above, in a time window from 1 to 3 s. In fact, these findings together constitute neurophysiological support for the notion that there is transition in VWM maintenance mechanisms at ∼1 s (Todd and Marois, 2004).
These data thus show load-dependently strengthened oscillation amplitudes in several cortical regions, which have earlier been identified in VWM studies using fMRI (Prabhakaran et al., 2000; Rowe et al., 2000; Munk et al., 2002; Pessoa et al., 2002; Linden et al., 2003; Todd and Marois, 2004). Although these frontal and parietal regions are well recognized cortical substrates for WM function, a number of other areas also have been revealed by fMRI and here with M/EEG. While the functional roles of cingulate, orbital, and insular cortices in VWM have remained unclear, it is intriguing that their multiband amplitude dynamics in the present study were similar to those observed in frontoparietal regions. The cingulate, orbital, and insular structures were also strongly phase synchronized with the frontoparietal networks in these frequency bands in our prior work (Palva et al., 2010b). These results together suggest that in addition to classical VWM regions, these regions also contribute to the VWM maintenance in subsecond time scales.
The functional role of neuronal oscillations in VWM retention
Three aspects of the present data yield insight into the functional roles of oscillatory activity in VWM: memory load dependence, anatomical localization, and correlation with behavior. Frontal and parietal regions are thought to underlie attentional and central executive functions in VWM as indexed by fMRI (Prabhakaran et al., 2000; Rowe et al., 2000; Sakai et al., 2002; Curtis and D'Esposito, 2003; Petrides, 2005) and lesion (Voytek and Knight, 2010) studies. Hence, the frontoparietal localization of the positive memory load dependence of high-alpha, beta, gamma, and high-gamma oscillations implies that they are associated with attentional functions of VWM retention. This finding thus supports earlier suggestions for a role for high-alpha (Klimesch et al., 2007; Palva and Palva, 2007), beta (Gross et al., 2004; Buschman and Miller, 2007), gamma, and high-gamma (Fries, 2009) oscillations in attention. Beta- and gamma-frequency band oscillations were also strengthened over several visual cortical regions, including the inferotemporal and occipitotemporal regions that are associated with visual object processing (Riesenhuber and Poggio, 2002; Grill-Spector and Malach, 2004; Konen and Kastner, 2008), which is in line with the idea that beta- and gamma-band oscillations support the maintenance of object representations in VWM (Tallon-Baudry et al., 1998; Jokisch and Jensen, 2007; Palva et al., 2010b). Thus, in terms of activity localized to sensory brain regions, the present data are in line with the currently prevailing idea of distinct functional roles for alpha vs beta-gamma oscillations. Alpha oscillation amplitude dynamics are often interpreted along the lines of the inhibition hypothesis (Pfürtscheller, 2003; Klimesch et al., 2007), which posits that large alpha amplitudes in a given region reflect the disengagement of that region from task-relevant processing and that alpha-band amplitude suppression reflects a release from inhibition of that region. However, both the overall amplitude suppression and the positive memory load dependence in high-alpha, beta, gamma, and high-gamma bands were largely colocalized in the frontoparietal network, which is difficult to reconcile with the inhibition hypothesis. It is thus possible that the phenomenology and functional roles of alpha and perhaps of also other oscillations depend on the level of cortical hierarchy (Bollimunta et al., 2008).
Neuronal correlates of behavioral VWM capacity
A number of fMRI studies report that activity in IPS and LOC may predict the behavioral VWM capacity (Todd and Marois, 2004; Xu and Chun, 2006; Robitaille et al., 2010). In MEG and EEG, the lateralization of ER components (Vogel and Machizawa, 2004; Robitaille et al., 2010), posterior alpha-band amplitude modulations (Sauseng et al., 2009), and the strength of interareal phase synchrony in alpha and beta bands (Palva et al., 2010b) are also correlated with the individual VWM capacity. In the present study, we found that the plateau in the load-dependent magnitude of the N1 evoked response component in early visual regions was positively correlated with individual capacity. During VWM retention period, the memory load-dependent strengthening of high-alpha-band amplitudes in the frontoparietal regions was negatively correlated with the individual capacity showing that high-alpha amplitude peaked or plateaued at progressively lower memory loads for higher capacity subjects. This suggests that enhanced high-alpha oscillations are associated with attentional control when the number of to-be-memorized items exceeds the individual memory capacity. In contrast, the plateau in the load-dependent strengthening of high-gamma oscillations was positively correlated with individual VWM capacity, which supports the concept that high-gamma oscillations underlie integrated neuronal object representations (Singer, 1999) that are sustained in an active state during VWM maintenance. Accordingly, the memory load dependence of high-gamma oscillations was also more pronounced in high- than in low-capacity subjects. However, the high- and low-capacity subject groups were overall characterized by qualitatively similar patterns of amplitude modulations, which indicates that these data do not warrant strongly dichotomous interpretations on subject classification. Overall, a growing body of evidence demonstrates that the neuronal mechanisms underlying the limited human VWM capacity may be more heterogeneous and complex than previously conceived and that these mechanisms include both encoding- and retention-related neuronal activities.
The relationship between the oscillation amplitudes and interareal synchrony
We have recently shown that interareal 1:1 phase synchrony in alpha, beta, and gamma bands is load-dependently strengthened in a frontoparietal network (Palva et al., 2010b) wherein also the oscillation amplitudes were load-dependently strengthened. This raised the question of whether the synchronization estimates were biased by an improved signal-to-noise ratio associated with the strengthening of amplitudes. The concurrent changes in phase synchrony were, however, much stronger than those predicted by SNR changes (Palva et al., 2010b), suggesting that the positive correlation of frontoparietal oscillation amplitudes with memory load could be a reflection of the colocalized strengthening of short-range phase synchronization. In line with this notion, the network topologies in the high-alpha and beta bands were more clustered and small-world-like than those in theta- or gamma-frequency bands (Palva et al., 2010a). Hence, phase synchrony in dense core-like structures is likely to be positively correlated with local amplitude changes, while in the topologically less clustered brain regions phase synchrony and local amplitude dynamics can be more independent. In line with previous studies (Freunberger et al., 2008; Doesburg et al., 2009; Freunberger et al., 2009), interareal synchronization was enhanced above baseline levels throughout the memory retention period (Palva et al., 2010b), whereas the oscillation amplitudes were suppressed below the baseline. This important phenomenological dichotomy between oscillation amplitudes and synchrony shows that local amplitude modulations and long-range interareal phase synchronization are dissociable phenomena that conceivably reflect local processing vs large-scale interactions.
This study was supported by the Academy of Finland, by the Helsinki University Research Funds, the National Institutes of Health National Center for Research Resources Grant P41 RR14075, and National Institute of Biomedical Imaging and Bioengineering Grants 1R01EB009048 and 1R01 EB006385.
- Correspondence should be addressed to either Satu Palva or J. Matias Palva, Neuroscience Center, P.O. Box 56, University Helsinki, 00014 Helsinki, Finland. or