Abstract
Visual working memory (VWM) recruits a broad network of brain regions, including prefrontal, parietal, and visual cortices. Recent evidence supports a “sensory recruitment” model of VWM, whereby precise visual details are maintained in the same stimulus-selective regions responsible for perception. A key question in evaluating the sensory recruitment model is how VWM representations persist through distracting visual input, given that the early visual areas that putatively represent VWM content are susceptible to interference from visual stimulation.
To address this question, we used a functional magnetic resonance imaging inverted encoding model approach to quantitatively assess the effect of distractors on VWM representations in early visual cortex and the intraparietal sulcus (IPS), another region previously implicated in the storage of VWM information. This approach allowed us to reconstruct VWM representations for orientation, both before and after visual interference, and to examine whether oriented distractors systematically biased these representations. In our human participants (both male and female), we found that orientation information was maintained simultaneously in early visual areas and IPS in anticipation of possible distraction, and these representations persisted in the absence of distraction. Importantly, early visual representations were susceptible to interference; VWM orientations reconstructed from visual cortex were significantly biased toward distractors, corresponding to a small attractive bias in behavior. In contrast, IPS representations did not show such a bias. These results provide quantitative insight into the effect of interference on VWM representations, and they suggest a dynamic tradeoff between visual and parietal regions that allows flexible adaptation to task demands in service of VWM.
SIGNIFICANCE STATEMENT Despite considerable evidence that stimulus-selective visual regions maintain precise visual information in working memory, it remains unclear how these representations persist through subsequent input. Here, we used quantitative model-based fMRI analyses to reconstruct the contents of working memory and examine the effects of distracting input. Although representations in the early visual areas were systematically biased by distractors, those in the intraparietal sulcus appeared distractor-resistant. In contrast, early visual representations were most reliable in the absence of distraction. These results demonstrate the dynamic, adaptive nature of visual working memory processes, and provide quantitative insight into the ways in which representations can be affected by interference. Further, they suggest that current models of working memory should be revised to incorporate this flexibility.
Introduction
Visual working memory (VWM) constitutes the brief maintenance and manipulation of visual information. It allows for a unified representation of the visual world, despite frequent temporal discontinuities in visual input that occur during eye movements, occlusions, and object motion (Curtis and D'Esposito, 2003; Serences, 2016). This cognitive ability is supported by a broad network of brain regions, including lateral prefrontal cortex (Chelazzi et al., 1993; D'Esposito et al., 1995, 2000; Ester et al., 2015; Riley and Constantinidis, 2015), parietal cortex (Christophel et al., 2015; Ester et al., 2015; Bettencourt and Xu, 2016; Xu, 2017), and primary sensory cortices (Harrison and Tong, 2009; Serences et al., 2009). According to a “sensory recruitment” model of VWM (Pasternak and Greenlee, 2005; Postle, 2006; D'Esposito, 2007), precise visual details are maintained in the same stimulus-selective regions responsible for primary visual processing during perception. Indeed, numerous studies have provided evidence that occipital cortex maintains visual features in working memory (Chelazzi et al., 1993; Miller et al., 1993; Magnussen, 2000; Awh and Jonides, 2001; Curtis and D'Esposito, 2003; Pasternak and Greenlee, 2005; D'Esposito, 2007; Harrison and Tong, 2009; Serences et al., 2009; Ester et al., 2013; Pratte and Tong, 2014; Sreenivasan et al., 2014; D'Esposito and Postle, 2015).
However, because early visual cortex is susceptible to interference from subsequent visual input, it remains unclear how VWM representations in these regions could survive visual interference. In fact, multiple studies have found that distractors presented during the memory delay period do impair VWM for simple visual features like spatial frequency (Magnussen et al., 1991; Bennett and Cortese, 1996; Nemes et al., 2011) and motion direction (Magnussen and Greenlee, 1992; Pasternak and Zaksas, 2003; McKeefry et al., 2007), as well as for more complex stimuli like faces (Yoon et al., 2006; Sreenivasan and Jha, 2007). This impairment is particularly evident when the distractor differs from the item in memory along a task-relevant feature dimension (Magnussen et al., 1991; Lalonde and Chaudhuri, 2002; Nemes et al., 2011), suggesting that visual interference likely acts on VWM representations maintained within segregated feature-specific channels in visual cortex (Sneve et al., 2011). Interestingly, interference from a subsequent distractor can exert an attractive pull on VWM representations of spatial frequency (Huang and Sekuler, 2010; Nemes et al., 2011; Dubé et al., 2014), color (Nemes et al., 2012), or orientation (Rademaker et al., 2015; Wildegger et al., 2015).
Recently, Bettencourt and Xu (2016) sought to identify the neural locus of distractor-resistant VWM representations in a human functional magnetic resonance imaging (fMRI) study. They found that remembered orientations could be decoded from the superior intraparietal sulcus (IPS) despite the presentation of irrelevant distractors, and that the discriminability of orientation information in early visual areas was reduced, but not abolished, by distractor presentation. This pattern of results has two equally plausible explanations, with important theoretical consequences for the sensory recruitment model of working memory. First, this pattern of results could suggest that any VWM representations observed in early visual areas are not behaviorally relevant. Alternatively, distraction during VWM could systematically bias early visual representations, as has been observed in behavior (Rademaker et al., 2015; Wildegger et al., 2015), and such biases could account for the observed reduction in orientation discriminability (Ester et al., 2016).
Here, we used an inverted encoding model orientation reconstruction approach to quantitatively assess the effect of distractors on VWM representations in early visual cortex and the IPS. This approach allowed us not only to identify the regions in which visual information is reliably represented during a working memory delay, but also to investigate whether, and how, these representations are affected by subsequent distractors.
Materials and Methods
Participants.
We recruited 21 healthy young adult participants for this study. Nine of these participants were excluded due to an inability to maintain fixation or an inability to remain alert or awake during their initial scan session. The final 12 participants (2 male, ages 18–35), who included two of the authors (E.S.L. and A.R.E.V.), each completed a 1 h training session and four 2 h MRI scan sessions. Participants were right-handed and had normal or corrected-to-normal vision, and all procedures were approved by the UC Berkeley Committee for the Protection of Human Subjects. Participants gave their written informed consent before the study and were compensated monetarily for their time. In addition, participants were given small bonuses after each scanning session that scaled with their performance on the cognitive task.
Behavioral training procedure.
Before the MRI scan sessions, each participant completed a 1 h training session to learn and practice the cognitive task, and practice maintaining central fixation. Participants each completed a minimum of two 12-trial runs of the task, during which eye position was continuously monitored (Eyelink-1000, SR Research). Feedback about mean behavioral precision and fixation quality was provided at the end of each practice run. In addition, participants were familiarized with the retinotopic mapping procedures and practiced maintaining visual fixation while performing a challenging peripheral visual detection task, in which they detected the brief appearance of a small gray circle within the polar angle mapping wedge (see Regions-of-interest).
Experimental design and statistical analyses
Cognitive task.
The task (Fig. 1) was designed to assess the precision of VWM for orientation, and to capture any memory biases induced by an intervening distractor. Each trial began with a green 500 ms pre-cue at fixation to warn participants that a new trial was about to start. The pre-cue was immediately followed by a right-lateralized oriented sinusoidal grating, and participants were instructed to remember the orientation of this grating as precisely as possible over the remainder of the trial. Because of hemispheric asymmetries in the representation of visual space in the early visual (Hougaard et al., 2015) and parietal (Sheremata et al., 2010; Berger et al., 2014; Jeong and Xu, 2016) cortices, we chose to present all stimuli in the right visual hemifield, rather than counterbalance stimulus hemifield across participants. The grating was presented for 500 ms, and followed by a 9.5 s blank delay (“Delay 1”). On two-thirds of trials, a distractor grating, which participants were instructed to ignore, was then presented for 500 ms in the same spatial location as the memory stimulus. On the remaining one-third of trials, the screen remained blank with only a fixation point during this 500 ms interval. Finally, there was a second 9.5 s blank delay (“Delay 2”) after which participants were presented with a (randomly oriented) test grating that they had to adjust to the memorized orientation using an MR-compatible joystick (Current Designs) within a 4 s response window. Trials were separated by a variable intertrial interval of 8, 10, or 12 s.
Although distractors were more likely to appear than a blank delay, distractor presentation was unpredictable on any given trial. The distractor consisted of a sinusoidal grating with an orientation that was 40–50° clockwise (50%) or counterclockwise (50%) of the remembered orientation. With the exception of two participants (E.S.L. and A.R.E.V.), participants were not informed of the relationship between the memory and distractor stimulus orientations. In addition, the task timing was slightly different for these first two participants (see fMRI acquisition and preprocessing), in that there was a 3 s response window, and the intertrial interval was 6.67, 8.33, or 10 s.
Memory orientations were distributed over the entire 180° orientation space; eight equally spaced “base” orientations were chosen (0–157.5°, in steps of 22.5°), with an additional positive or negative 1–10° of jitter added on each trial. Grouping the orientations into bins allowed us to ensure that similar sets of orientations were shown in each of the three distractor conditions. To limit fMRI run lengths, the eight orientation categories were randomly split into two sets for each participant, with one set shown on odd runs and the other on even runs. There were 12 trials per run: one trial from each of four orientation categories for each of the three distractor conditions. Participants completed between 22 and 32 runs (participants always completed an even number of runs for counterbalancing purposes), yielding an average of 108 trials in each distractor condition. Participants were given feedback at the end of each run about the mean precision of their responses.
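The orientation scheme above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the integer-valued uniform jitter and the shuffled trial order are assumptions that the text does not specify.

```python
import random

# Eight equally spaced base orientations: 0, 22.5, ..., 157.5 deg
BASE_ORIENTATIONS = [i * 22.5 for i in range(8)]

def split_categories(rng):
    """Randomly split the eight base orientations into two sets of four."""
    shuffled = BASE_ORIENTATIONS[:]
    rng.shuffle(shuffled)
    return sorted(shuffled[:4]), sorted(shuffled[4:])

def jittered(base, rng):
    """Add 1-10 deg of jitter with a random sign, wrapped into [0, 180).

    Assumption: jitter is drawn as a whole number of degrees.
    """
    jitter = rng.randint(1, 10) * rng.choice([-1, 1])
    return (base + jitter) % 180.0

def make_run(category_set, rng):
    """One 12-trial run: 4 orientation categories x 3 distractor conditions."""
    conditions = ["none", "clockwise", "counterclockwise"]
    trials = [(jittered(base, rng), cond)
              for base in category_set for cond in conditions]
    rng.shuffle(trials)  # assumption: trial order is randomized within a run
    return trials

rng = random.Random(0)
odd_set, even_set = split_categories(rng)  # shown on odd and even runs
run = make_run(odd_set, rng)
```

Because each run contains only four of the eight categories, a pair of consecutive runs is needed to cover the full orientation space.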
Stimuli were programmed in MATLAB (v2012b, MathWorks) using the Psychtoolbox (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007), projected onto a screen at the rear of the magnet bore, and viewed via a head coil-mounted mirror. Each circular sinusoidal grating stimulus subtended 10° of visual angle and was centered on a point 7° of visual angle to the right of fixation. The sinusoidal gratings were full contrast, with a spatial frequency of 0.5 cycles/°, and they alternated in phase at a rate of 12 Hz for the duration of their presentation. Participants were required to maintain central fixation throughout each scanning run, and eye position was monitored with an MR-compatible eye tracker (Avotec).
Behavioral analyses.
The method-of-adjustment response (see Cognitive task) yielded a trial-by-trial measure of memory error in degrees for each distractor condition (no distractor, clockwise distractor, counterclockwise distractor), for each participant. Because we did not have an a priori expectation for any differences between the clockwise and counterclockwise distractor conditions, we flipped the signs of the errors from the counterclockwise distractor condition and combined them with those from the clockwise distractor condition before model fitting, to increase analysis power. The group data with separate model fits for counterclockwise and clockwise distractor trials are provided as Extended Data, as are the individual subject error histograms and model fits.
First, to examine whether participants' mean response errors reliably differed between the no-distractor and distractor conditions, we calculated the circular distance between the circular means of the no-distractor and distractor error distributions for each participant (Berens, 2009). Then, we calculated circular 95% confidence intervals to determine whether the resulting set of no-distractor/distractor differences significantly differed from zero.
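As a rough illustration of the pooling and circular-mean steps, the sketch below uses plain Python rather than the circular statistics toolbox cited above. The function names are hypothetical, and treating errors as periodic over 180° (the orientation space) is an assumption.

```python
import math

def pooled_distractor_errors(cw_errors, ccw_errors):
    """Flip the sign of counterclockwise-distractor errors and pool with clockwise."""
    return list(cw_errors) + [-e for e in ccw_errors]

def circular_mean_deg(errors_deg, period=180.0):
    """Circular mean of angles (in degrees) that wrap with the given period."""
    scale = 2.0 * math.pi / period  # map one period onto the full circle
    s = sum(math.sin(e * scale) for e in errors_deg)
    c = sum(math.cos(e * scale) for e in errors_deg)
    return math.atan2(s, c) / scale

def circular_distance_deg(a, b, period=180.0):
    """Signed shortest distance a - b, wrapped into [-period/2, period/2)."""
    return (a - b + period / 2.0) % period - period / 2.0

# Hypothetical errors (deg) for one participant
no_distractor = [-2.0, 1.0, 0.5, -0.5]
distractor = pooled_distractor_errors([1.0, 3.0], [-2.0, 0.0])
shift = circular_distance_deg(circular_mean_deg(distractor),
                              circular_mean_deg(no_distractor))
```

With this sign convention, a positive `shift` indicates that mean responses moved toward the distractor.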
To assess whether the distractors selectively caused an attractive bias, or whether they also affected memory precision or guess rates, we followed this initial analysis with a more detailed mixture model analysis. Using the MemToolbox (Suchow et al., 2013) in MATLAB, each error distribution was fit with a mixture model of a von Mises distribution and a uniform distribution. As in previous studies (Zhang and Luck, 2008), three free parameters were estimated: the mean of the von Mises (reflecting any systematic clockwise or counterclockwise biases in participants' responses), the SD of the von Mises (reflecting the average precision of a participant's responses), and the height of the uniform distribution (reflecting the rate of random guesses). This model was fit separately for each condition for each of the 12 participants, using Markov Chain Monte Carlo to compute maximum a posteriori parameter estimates and 95% credible intervals (the Bayesian analog of confidence intervals). We compared the size of the resulting credible intervals across distractor conditions to assess whether the quality of model fit differed by condition. Finally, we performed paired samples t tests to compare estimated parameters between the no-distractor and distractor conditions.
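For reference, the three-parameter mixture density described above can be written in the standard form of Zhang and Luck (2008). The sketch below assumes that response errors ε have been mapped from the 180° orientation space onto the full circle in radians; the symbols μ, κ, and g are notation introduced here for illustration, and κ is the von Mises concentration from which the reported SD is derived.

```latex
p(\varepsilon \mid \mu, \kappa, g)
  = (1 - g)\,\frac{e^{\kappa \cos(\varepsilon - \mu)}}{2\pi I_0(\kappa)}
  + g\,\frac{1}{2\pi},
\qquad \varepsilon \in [-\pi, \pi)
```

Here μ captures any systematic bias, a larger κ corresponds to a smaller SD (higher precision), g is the guess rate, and I_0 is the modified Bessel function of the first kind of order zero.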
fMRI acquisition and preprocessing.
MRI data were acquired in the UC Berkeley Henry H. Wheeler, Jr. Brain Imaging Center with a Siemens TIM/Trio 3T MRI scanner with a 12-channel receive-only head coil. Whole-brain MPRAGE T1-weighted scans were acquired for anatomical localization, normalization, and cortical surface reconstruction. Functional data were obtained using a one-shot T2*-weighted echoplanar imaging (EPI) sequence sensitive to blood oxygenation level-dependent (BOLD) contrast. The EPI sequence parameters for the first two participants (TR = 1.6667 s, TE = 30 ms, flip angle = 55°, field-of-view = 1110 × 1110, matrix size = 74 × 74, in-plane resolution = 3 × 3 mm, 25 ascending 3-mm-thick axial slices separated by a 0.3 mm interslice gap) were slightly different from the sequence used for the remainder of the participants (TR = 2.0 s, TE = 30 ms, flip angle = 55°, field-of-view = 1332 × 1332, matrix size = 74 × 74, in-plane resolution = 3 × 3 mm, 30 ascending 3-mm-thick axial slices separated by a 0.3 mm interslice gap), as minor adjustments were made to improve whole-brain coverage.
Functional MRI data were preprocessed with AFNI (Cox, 1996) and custom MATLAB (v2012b, MathWorks) scripts. Following slice-time correction, each EPI run was motion-corrected and coregistered to the anatomical scan in a single resampling step, using a six-parameter affine registration in align_epi_anat.py (Saad et al., 2009). Finally, each run was linearly detrended, voxelwise, in preparation for inverted encoding model orientation reconstruction.
Regions-of-interest.
We chose as our regions-of-interest (ROIs) the left and right early visual areas and the left and right IPS. Because the stimuli were always presented in the right hemifield, the left hemisphere was always contralateral to the memory stimulus, distractor, and probe, and the right hemisphere was always ipsilateral. We limited our early visual ROIs to V1–V3, because decoding of orientation information in VWM is of comparable quality for V1 through V3, and somewhat less robust for V3A/V4 (Harrison and Tong, 2009; Pratte and Tong, 2014).
Gray/white matter boundary segmentation and cortical surface reconstruction were performed with FreeSurfer's recon-all tool (Fischl et al., 2002; http://surfer.nmr.mgh.harvard.edu/fswiki/recon-all). Retinotopic mapping was used to delineate early visual areas V1–V3 (see Retinotopic mapping of early visual areas V1–V3), and the anatomical IPS ROIs were defined on the cortical surface using the automated “S_intrapariet_and_P_trans” label from the Destrieux atlas (Destrieux et al., 2010) in FreeSurfer. Finally, all surface-based ROIs were transformed back into native volume space for subsequent analyses.
Retinotopic mapping of early visual areas V1–V3.
During the first scanning session, participants completed six 6.5 min retinotopic mapping functional runs: two each of clockwise- and counterclockwise-rotating polar angle mapping runs, and two eccentricity mapping runs. For the polar angle mapping runs, a 40° black and white checkerboard wedge (phase-alternating at 8 Hz) rotated around fixation. The wedge completed one revolution every 40 s, for 10 revolutions per run. For the eccentricity mapping runs, a checkerboard ring stimulus expanded slowly from fixation, with a full cycle completed every 40 s, 10 times per run. Participants were instructed to maintain central fixation throughout the run, and to press a button when a small gray circle appeared in a random location within the wedge or ring.
The functional retinotopy data were processed with AFNI's @Retino_Proc script (Warnking et al., 2002). Then, left and right visual areas V1, V2, and V3, dorsal and ventral, were individually delineated on the cortical surface in AFNI's SUMA (Saad et al., 2004; Saad and Reynolds, 2012), following standard procedures (Wandell and Winawer, 2011).
Inverted encoding model analyses.
To reconstruct orientation information from multivoxel patterns of BOLD activity, we adopted an inverted encoding model approach, which has also been termed “forward encoding” (Brouwer and Heeger, 2009; Ester et al., 2013). All of the following analyses were performed separately within each ROI (left and right V1–V3, left and right IPS).
First, for every trial, we extracted groups of volumes representing stimulus perception (4–6 s after stimulus onset), the first memory delay (10–14 s after stimulus onset), and the second memory delay (20–26 s after stimulus onset), and we averaged each group to yield a single BOLD intensity pattern for each trial epoch. For the analysis of perception/encoding and first memory delay representations, the inverted encoding model analysis was completed separately within each epoch, using a leave-one-run-pair-out cross-validated structure. A pair of runs were held out on each iteration, because only half of the orientation categories were presented on each run; this ensured that the training and testing datasets contained examples that were evenly distributed across the entire orientation space. To reconstruct distractor perception and the second memory delay, the encoding model was trained on the BOLD patterns from the first memory delay. Training the model on pre-distractor time points allowed for direct comparison between reconstructed representations with and without distractors. In addition, this method avoided any biases that could have been introduced by estimating the model on time points containing a mixture of distractor conditions.
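The epoch averaging and leave-one-run-pair-out structure can be sketched as follows. The function names are hypothetical, and pairing consecutive runs (1 with 2, 3 with 4, and so on) is an assumption based on the odd/even split of orientation categories.

```python
def epoch_mean(volumes):
    """Average a list of per-volume voxel patterns into a single pattern."""
    n = len(volumes)
    n_vox = len(volumes[0])
    return [sum(v[i] for v in volumes) / n for i in range(n_vox)]

def run_pair_folds(n_runs):
    """Yield (train_runs, test_runs), holding out one odd/even run pair per fold.

    Assumption: runs are numbered from 1 and paired consecutively.
    """
    if n_runs % 2 != 0:
        raise ValueError("expected an even number of runs")
    runs = list(range(1, n_runs + 1))
    for first in range(1, n_runs + 1, 2):
        test = [first, first + 1]
        train = [r for r in runs if r not in test]
        yield train, test

folds = list(run_pair_folds(6))
```

Each held-out pair contains all eight orientation categories, so every test set spans the full orientation space.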
After averaging the data, we characterized the orientation preferences of each voxel in an ROI by creating a voxel-by-voxel encoding model. More specifically, we used linear regression (Eq. 1) to calculate the weights (W: 8 channels × m voxels) that related the BOLD intensities from each voxel in the training set (B1: m voxels × n trials) to the modeled activation of eight hypothetical orientation channels (C1: 8 channels × n trials), given the orientations that were presented on each trial. These hypothetical orientation basis functions were half-wave rectified sinusoids raised to the seventh power and distributed evenly from 0° to 179°. Next, we inverted the voxel-by-voxel encoding model (Eq. 2), and the calculated weights (W) were used to determine the output of each of the hypothetical orientation channels (C2), given the BOLD activity patterns in the held-out subset of trials (B2). Then, for every possible orientation (0° to 179°), we calculated the Pearson's correlation between the reconstructed orientation channel output (C2) and the eight hypothetical orientation channels (Brouwer and Heeger, 2009). The resulting orientation reconstruction function was therefore maximal at the orientation that had most likely been represented on that trial, based on the measured pattern of BOLD activity. To evaluate whether reliable orientation information could be reconstructed across trials, we recentered each single-trial orientation reconstruction function so that the “correct” orientation on that trial corresponded to 0°, and we then averaged across trials to create an average orientation reconstruction function for that ROI in that participant. Last, each reconstruction function was averaged across participants within an ROI.
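The two regression steps referred to as Eq. 1 and Eq. 2 follow the general linear-model form of Brouwer and Heeger (2009). A sketch under the dimension conventions used here (W: 8 channels × m voxels; B1: m voxels × n trials; C1: 8 channels × n trials, with one column per training trial) is given below. The hat notation and the transposed layout are assumptions chosen to match those dimensions, not a reproduction of the published equations.

```latex
\begin{aligned}
B_1 &= W^{\top} C_1
  && \text{forward model: voxel responses as weighted sums of channel responses} \\
\hat{W} &= \left(C_1 C_1^{\top}\right)^{-1} C_1 B_1^{\top}
  && \text{training step (cf. Eq. 1): least-squares weights} \\
\hat{C}_2 &= \left(\hat{W} \hat{W}^{\top}\right)^{-1} \hat{W} B_2
  && \text{inversion step (cf. Eq. 2): channel responses for held-out trials}
\end{aligned}
```

Both inverted matrices are 8 × 8, so the estimates are well defined when the eight channel responses are linearly independent across training trials and the ROI contributes at least eight voxels with linearly independent weight profiles.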
Reconstruction significance was calculated with a permutation procedure, in which the quality of each reconstruction was quantified with a “representational fidelity” metric (RF; Sprague et al., 2016). More specifically, we calculated the vector mean across the entire group-averaged orientation reconstruction (Eq. 3), where r(θ) is the mean Pearson's correlation at a given polar angle, spanning all −90° to 89° of zero-centered orientation space. Next, we created a null distribution of representational fidelity values by repeating the above orientation reconstruction analysis for 1000 permutations of the remembered orientation trial labels (within the true run structure of the data) and again applying Equation 3. Finally, we calculated the fraction of samples in the null distribution whose representational fidelity was larger than the observed representational fidelity, which yielded an empirical p value. Across all reconstructions for which we calculated representational fidelity significance, and comparisons between those reconstructions, we corrected for multiple comparisons by controlling the false discovery rate (FDR; Benjamini, 2001), q = 0.05.
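A minimal sketch of the fidelity metric and the empirical p value is given below. Because Eq. 3 is not reproduced in this text, the exact form is an assumption: the 180° orientation axis is doubled so that it spans a full circle, and the reconstruction is projected onto the 0° direction. Function names are hypothetical.

```python
import math

def representational_fidelity(recon, thetas_deg):
    """Mean projection of a zero-centered reconstruction onto the 0 deg direction.

    Assumption: orientation (period 180 deg) is doubled to span the full circle.
    """
    return sum(r * math.cos(math.radians(2.0 * t))
               for r, t in zip(recon, thetas_deg)) / len(recon)

def empirical_p(observed, null_values):
    """Fraction of null fidelity values that are larger than the observed value."""
    return sum(1 for v in null_values if v > observed) / len(null_values)

thetas = list(range(-90, 90))  # -90 to 89 deg, zero-centered
peaked = [math.cos(math.radians(2.0 * t)) for t in thetas]  # toy peak at 0 deg
flat = [0.2] * len(thetas)                                  # no orientation tuning

rf_peaked = representational_fidelity(peaked, thetas)
rf_flat = representational_fidelity(flat, thetas)
```

A reconstruction that peaks at the remembered orientation yields a positive fidelity, whereas a flat reconstruction yields a value near zero regardless of its overall height.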
To compare reconstruction quality between ROIs, we simply extended the above procedure to perform a two-tailed test between regions. First, we calculated the absolute value of the difference between the ROIs' representational fidelity values. Then, we performed the same calculation on each of the 1000 sets of permuted results (see previous paragraph). Finally, the p value was calculated as the fraction of this null distribution that was larger than the observed difference.
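The two-tailed comparison between ROIs can be sketched as below. The function name is hypothetical, and the sketch assumes that the two null lists are aligned, so that entry i in each list comes from the same label permutation.

```python
def roi_difference_p(rf_a, rf_b, null_a, null_b):
    """Two-tailed permutation p value for the absolute fidelity difference.

    null_a[i] and null_b[i] are assumed to come from the same permutation i.
    """
    observed = abs(rf_a - rf_b)
    null_diffs = [abs(a - b) for a, b in zip(null_a, null_b)]
    return sum(1 for d in null_diffs if d > observed) / len(null_diffs)

# Toy example with four permutations (a real analysis would use 1000)
p = roi_difference_p(0.026, 0.013,
                     [0.000, 0.020, 0.000, 0.001],
                     [0.001, 0.000, 0.000, 0.020])
```

Taking the absolute value on both the observed and null differences is what makes the test two-tailed.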
Testing for biases in reconstructed orientation representations.
To assess how orientation representations changed post-distractor, we examined the average orientation reconstruction functions during the second memory delay, separately for each participant and ROI.
First, we computed the slope of the best-fit line to the mean reconstruction error sizes from the three distractor conditions (counterclockwise-none-clockwise). Because this analysis simultaneously incorporates both counterclockwise and clockwise distractor reconstructions, we did not pool the counterclockwise and clockwise distractor data here. The resulting slope would be positive if the distractors exerted an attractive effect on the orientation representations, zero if there were no effect, and negative if the distractors exerted a repulsive effect. Next, we generated a null distribution by randomly flipping the signs of the individual participant slopes and averaging across participants, repeating this 1000 times. Finally, we tested for the predicted positive slope by comparing the observed mean slope across participants to this null distribution.
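The slope and sign-flip test can be sketched as follows. Coding the three conditions as x = −1, 0, +1 is an assumption, as are the function names; with that coding, the least-squares slope reduces to half the difference between the clockwise and counterclockwise means.

```python
import random

def condition_slope(ccw, none, cw):
    """Least-squares slope through three condition means at x = -1, 0, +1.

    With this coding the middle value does not affect the slope.
    """
    return (cw - ccw) / 2.0

def sign_flip_p(slopes, n_perm=1000, seed=0):
    """One-tailed p value for a positive mean slope, using random sign flips."""
    rng = random.Random(seed)
    observed = sum(slopes) / len(slopes)
    count = 0
    for _ in range(n_perm):
        flipped = [s * rng.choice([-1, 1]) for s in slopes]
        if sum(flipped) / len(flipped) >= observed:
            count += 1
    return count / n_perm

# Hypothetical per-participant slopes (deg per condition step)
slopes = [condition_slope(-2.0, 0.0, 2.0) for _ in range(12)]
p_attractive = sign_flip_p(slopes)
```

Flipping signs at the participant level preserves the magnitude of each participant's effect while removing any consistent direction, which is the null hypothesis of no systematic bias.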
Last, to evaluate whether the quality of the reconstructions was comparable with and without distractors, regardless of any bias in the reconstruction center, we performed a modified version of the above representational fidelity analysis. First, rather than using a zero-centered cosine function in Equation 3, we used a series of cosine functions centered at each of the 180 possible orientation centers and took the maximum (hereafter “maximal fidelity”). Then, for each ROI, we calculated the absolute value of the difference in maximal fidelities between the no-distractor and distractor reconstructions. Finally, we repeated this analysis for each of the 1000 null reconstructions and calculated the p value as the fraction of these null maximal fidelity differences that exceeded the size of the empirical maximal fidelity difference.
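The maximal fidelity computation can be sketched as below. As with the fidelity metric itself, doubling the orientation angle to span a full circle is an assumption, and the function names are hypothetical.

```python
import math

def fidelity_at(recon, thetas_deg, center_deg):
    """Fidelity computed with a cosine centered at center_deg instead of 0 deg.

    Assumption: orientation (period 180 deg) is doubled to span the full circle.
    """
    return sum(r * math.cos(math.radians(2.0 * (t - center_deg)))
               for r, t in zip(recon, thetas_deg)) / len(recon)

def maximal_fidelity(recon, thetas_deg):
    """Return (maximal fidelity, center in deg) over the 180 candidate centers."""
    return max((fidelity_at(recon, thetas_deg, c), c) for c in range(-90, 90))

thetas = list(range(-90, 90))
# Toy reconstruction whose peak is shifted 20 deg away from the true orientation
shifted = [math.cos(math.radians(2.0 * (t - 20))) for t in thetas]
best_fidelity, best_center = maximal_fidelity(shifted, thetas)
```

Because the maximum is taken over all centers, a reconstruction that is shifted but otherwise intact keeps its full fidelity, which separates reconstruction quality from bias.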
Reconstructing orientation representations over time.
Finally, we conducted an exploratory analysis to examine the temporal evolution of the VWM representations over the course of the trial, both with and without intervening distractors. This analysis closely resembled the orientation reconstruction approach described above, in that the encoding model was always estimated based on the two TRs at the end of the first memory delay (10–14 s poststimulus onset). However, the resulting voxel weights were then used to create orientation reconstruction functions within a two-TR sliding window moved one TR at a time over the entire trial.
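The sliding-window scheme can be illustrated with a short sketch. The function names are hypothetical, and the voxel patterns are represented as plain lists with one pattern per TR.

```python
def sliding_windows(n_trs, width=2, step=1):
    """TR indices for a fixed-width window moved one step at a time."""
    return [list(range(start, start + width))
            for start in range(0, n_trs - width + 1, step)]

def window_means(timecourse, width=2):
    """Average voxel patterns (one list per TR) within each sliding window."""
    out = []
    for idx in sliding_windows(len(timecourse), width):
        n_vox = len(timecourse[0])
        out.append([sum(timecourse[t][v] for t in idx) / width
                    for v in range(n_vox)])
    return out

windows = sliding_windows(4)
means = window_means([[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]])
```

Each windowed pattern would then be passed through the fixed encoding-model weights to give one reconstruction per window position.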
Results
Behavior
Before each of the following behavioral analyses, we sign-flipped the data from the counterclockwise distractor trials and combined it with the clockwise distractor trials, to yield a single distribution of “distractor” errors to compare against the “no distractor” error distribution within each participant. In this combined distractor data, an attractive bias would then appear as a clockwise shift.
First, to test for an overall difference in the mean errors between trials with and without distractors, we calculated circular 95% confidence intervals around the mean circular distance between the no-distractor and distractor mean error directions. Because the resulting confidence interval did not include zero [mean = 1.042°, circular 95% CI (0.060, 2.025)], this suggests that the mean behavioral responses were shifted toward the distractor orientations.
We followed this coarse analysis with a more specific model-based analysis to examine which parameters of participants' response distributions were affected by distraction. Using maximum likelihood estimation, we fit each participant's response error distribution with a mixture of a von Mises and a uniform distribution (Zhang and Luck, 2008), separately for trials with and without distractors. Three parameters of these distributions were estimated and then compared across distractor conditions: the mean and SD of the von Mises reflect the mean bias and precision of memory representations, respectively, whereas the height of the fitted uniform distribution captures the guess rate. Sample no-distractor and distractor response error distributions and resulting model fits for an example participant are depicted in Figure 2A, and the same is shown for all participants in Extended Data Fig. 2-1.
A paired-samples t test revealed that the mean bias (Fig. 2B, left) varied by distractor condition (t(11) = 2.336, p = 0.039), in that participants showed a significant positive (attractive) bias of about one degree toward presented distractors, but no bias on distractor-free trials. In contrast, there was no effect of distractors on mean guess rate (Fig. 2B, center; t(11) = 0.498, p = 0.628) or on precision (Fig. 2B, right; t(11) = 0.535, p = 0.603). The model fits were of significantly higher quality for the distractor trials compared with the no-distractor trials (as indexed by the size of the 95% credible intervals around each parameter estimate), because the former included twice the number of trials per participant (repeated-measures ANOVA, main effect of distractor condition F(1) = 31.848, p < 0.001).
Similar effects were found when data from the counterclockwise and clockwise distractor conditions were fitted separately (Fig. 2-2), although a reliable bias was only found in the counterclockwise distractor condition.
Orientation reconstruction during perception and pre-distractor memory delay
Using an inverted encoding model, we observed robust reconstruction of perceived orientations (Fig. 3, left column) in the contralateral (left hemisphere) early visual areas (RF = 0.026, p < 0.001), as well as in the ipsilateral early visual areas (RF = 0.013, p = 0.002). Reconstruction accuracy was superior in contralateral relative to ipsilateral early visual areas (two-tailed p = 0.003). In addition, perceived orientations could be reliably reconstructed from activity patterns in the anatomical contralateral IPS ROI (RF = 0.010, p < 0.001), with comparable fidelity to the contralateral V1–V3 (two-tailed p = 0.348). Although the ipsilateral IPS ROI also yielded accurate orientation reconstructions (RF = 0.004, p = 0.012), they were of significantly lower fidelity than in the contralateral V1–V3 (two-tailed p = 0.015).
At the end of the first memory delay, 10–14 s post-encoding, we found reliable orientation reconstructions in both contralateral and ipsilateral V1–V3 (RFs = 0.018, both p < 0.001), and in the contralateral IPS ROI (RF = 0.013, p = 0.001). The average quality of the reconstructions did not differ between the contralateral and ipsilateral early visual areas (two-tailed p = 0.938) or between the contralateral early visual areas and IPS (two-tailed p = 0.386). In contrast, we did not observe reliable orientation representations in the ipsilateral IPS at this time point (RF = 0.001, p = 0.409). Therefore, we did not include the ipsilateral IPS in subsequent post-distractor analyses, in which the encoding model was trained on data from this pre-distractor memory delay.
Effect of distractors on subsequent VWM reconstructions
To examine whether VWM representations were unaffected, disrupted, or biased by the distractor orientations, we applied a similar inverted encoding model orientation reconstruction approach. However, instead of training and testing within a single time point, we trained the encoding model on the end of the first memory delay (10–14 s poststimulus), and separately reconstructed orientation representations for each of the three distractor conditions, based on an average activity pattern from the three time-points at the end of the second delay (20–26 s poststimulus). To increase our power to observe distractor effects, we mirrored the resulting orientation reconstruction functions from the counterclockwise distractor trials and averaged them with the clockwise distractor reconstructions. Therefore, in the “distractor” reconstruction plots, the correct orientation is centered at 0°, with the distractor at positive 40–50°.
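The mirroring step can be sketched as follows, assuming each reconstruction is sampled at 180 integer orientations from −90° to 89° with 0° at the middle index. The function names are hypothetical.

```python
def mirror_reconstruction(recon):
    """Mirror a reconstruction about 0 deg.

    Assumption: index i holds theta = i - n/2 (e.g. -90..89 deg for n = 180),
    so -theta lands at index (-i) mod n and -90 deg maps onto itself.
    """
    n = len(recon)
    return [recon[(-i) % n] for i in range(n)]

def pool_distractor_reconstructions(cw, ccw):
    """Average clockwise reconstructions with mirrored counterclockwise ones."""
    mirrored = mirror_reconstruction(ccw)
    return [(a + b) / 2.0 for a, b in zip(cw, mirrored)]

# Toy example: a counterclockwise-trial bump at -45 deg moves to +45 deg
ccw = [0.0] * 180
ccw[45] = 1.0          # index 45 corresponds to theta = -45 deg
mirrored = mirror_reconstruction(ccw)
```

After mirroring, a pull toward the distractor appears on the positive side of the axis in both conditions, so the two sets of reconstructions can be averaged without canceling a bias.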
On trials without distractors (Fig. 4, left column), remembered orientations could be reliably reconstructed in the contralateral (RF = 0.020, p = 0.009) and ipsilateral (RF = 0.014, p = 0.041) early visual areas at the end of the second memory delay, although reconstructions in ipsilateral V1–V3 did not exceed the FDR-corrected threshold of p = 0.023. However, there was no difference in reconstruction accuracy between the contralateral and ipsilateral early visual areas (two-tailed p = 0.430). Finally, orientation information in VWM could not be reliably reconstructed from the contralateral IPS ROI at this time point on the no-distractor trials (RF = 0.005, p = 0.264), although reconstruction quality in the contralateral IPS did not significantly differ from that in the contralateral early visual ROI (two-tailed p = 0.129).
Figure 4-1
In contrast, after the presentation of a distractor (Fig. 4, right column), reconstructions from the contralateral early visual areas were not reliably centered on the correct orientation (RF = 0.006, p = 0.133). There were, however, accurate reconstructions in the contralateral IPS (RF = 0.013, p = 0.023), and a trend toward accurate reconstructions in the ipsilateral V1–V3 (RF = 0.008, p = 0.049, does not exceed FDR threshold of p = 0.023), although neither significantly differed in accuracy from the contralateral early visual area reconstructions (ipsilateral V1–V3: two-tailed p = 0.685; contralateral IPS: two-tailed p = 0.344). Separate reconstructions for the counterclockwise and clockwise trials are presented in Fig. 4-1.
Testing for attractive biases in reconstructed orientation representations
The failure to observe accurate reconstructions in early visual areas post-distraction could be due to distractor-related biases in early visual representations. Because the following analyses examine the slope across the counterclockwise, no-, and clockwise distractor conditions, and therefore already incorporate the data from both distractor directions, we did not explicitly pool the distractor data in this analysis.
First, to assess whether reconstructed VWM representations from each ROI were biased toward the previously presented distractor orientations, as found in behavior, we calculated the orientation at which each participant's orientation reconstruction function was maximal, separately for the counterclockwise, no-, and clockwise distractor conditions. Then, we calculated the circular mean and SE of these values across participants for visualization purposes (Fig. 5A). We found a linear effect of distractor condition on reconstruction error size in the contralateral early visual ROI (Fig. 5A, left), such that reconstructions were biased an average of 15 degrees toward the distractors, but centered near zero when no distractors were presented. We found a similar effect of distractor condition on reconstruction bias in the ipsilateral early visual ROI (Fig. 5A, center). No effect of distractor condition on reconstruction bias was found in the contralateral IPS ROI (Fig. 5A, right).
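Because orientation is periodic over 180°, the across-participant average of the peak locations must be computed circularly rather than arithmetically. The following is a minimal sketch of one way to do so; the function names are hypothetical, and the SE shown (circular SD divided by the square root of the sample size) is an assumed approximation rather than the procedure used here.

```python
import numpy as np

def peak_orientation(recon, offsets_deg):
    """Offset (deg) at which one reconstruction function is maximal."""
    return np.asarray(offsets_deg)[np.argmax(recon)]

def circular_mean_se_deg(angles_deg, period=180.0):
    """Circular mean and an approximate SE for angles that wrap every
    `period` degrees (180 for orientation)."""
    scale = 2 * np.pi / period                  # map one period onto 2*pi
    phases = np.asarray(angles_deg, dtype=float) * scale
    resultant = np.mean(np.exp(1j * phases))    # mean resultant vector
    mean_deg = np.angle(resultant) / scale
    r = min(np.abs(resultant), 1.0)             # guard against rounding above 1
    circ_sd_deg = np.sqrt(-2.0 * np.log(r)) / scale
    # Assumed SE approximation: circular SD / sqrt(n)
    return mean_deg, circ_sd_deg / np.sqrt(phases.size)

# Peaks at +85 and -85 deg lie 10 deg apart across the wrap point,
# so their circular mean is at +/-90 deg, not at 0 deg.
mean_deg, se_deg = circular_mean_se_deg([85.0, -85.0])
```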
To quantify these effects, we computed, for each participant and ROI, the slope of the best-fit line to the means of the trialwise reconstruction errors in the counterclockwise distractor, no distractor, and clockwise distractor conditions. Then, because the measured slopes in the contralateral and ipsilateral early visual areas were correlated across participants (Spearman's ρ = 0.692, p = 0.016), we averaged the slopes from these ROIs to yield a single “early visual” bias value per participant. Finally, we tested for the predicted positive slope by comparing the observed mean slope across participants to a null distribution generated by randomly permuting the signs of the individual participant slopes (see Materials and Methods). We found that there was an attractive biasing effect of the distractors on the VWM representations in the early visual ROIs (Fig. 5B, left; p = 0.041), but not on the representations in the contralateral IPS (Fig. 5B, right; p = 0.858; early visual vs IPS p = 0.019).
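The slope computation and sign-permutation test can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the three conditions are coded as -1, 0, and +1, the null distribution is built from 10,000 random sign flips, and the example slope values are invented purely to exercise the functions.

```python
import numpy as np

def bias_slope(mean_errors):
    """Slope of the best-fit line through the mean reconstruction errors for
    the counterclockwise, no-, and clockwise distractor conditions
    (assumed coding: -1, 0, +1)."""
    condition = np.array([-1.0, 0.0, 1.0])
    slope, _intercept = np.polyfit(condition, mean_errors, deg=1)
    return slope

def sign_flip_p(slopes, n_perm=10000, seed=0):
    """One-tailed p value for a positive mean slope, using a null
    distribution built by randomly flipping the sign of each
    participant's slope."""
    slopes = np.asarray(slopes, dtype=float)
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n_perm, slopes.size))
    null_means = (signs * slopes).mean(axis=1)
    # Add-one correction so the estimated p value is never exactly zero
    return (np.sum(null_means >= slopes.mean()) + 1) / (n_perm + 1)

# Invented per-participant slopes for two ROIs (illustration only)
contra = np.array([12.0, 8.0, 15.0, 3.0, 9.0, 11.0])
ipsi = np.array([10.0, 6.0, 9.0, 5.0, 7.0, 13.0])
early_visual = (contra + ipsi) / 2.0    # single bias value per participant
p_value = sign_flip_p(early_visual)
```

With the 12 participants in this dataset there are only 2^12 = 4096 distinct sign assignments, so an exact enumeration of the null distribution would also be feasible.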
Finally, to assess whether post-distractor orientation reconstructions were of comparable quality to those on trials without distractors, regardless of the observed biases, we performed a subsequent reconstruction fidelity analysis that did not require the representation to be centered at zero (see Materials and Methods). Representations were of comparable fidelity with and without distractors in all three ROIs (contralateral V1–V3: p = 0.276; ipsilateral V1–V3: p = 0.831; contralateral IPS: p = 0.281, all two-tailed comparisons).
Exploratory time-resolved orientation reconstructions
To examine how the orientation representations evolved over time in each ROI, we repeated the above analysis within a two-TR sliding window over the entire course of the trial. Because the first two participants had a shorter TR than the remaining 10 (1.667 vs 2 s), their data were not included in this descriptive analysis.
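One way to form the two-TR windows is sketched below. Whether the two TRs in each window were averaged before reconstruction or handled in another way is not specified in this section, so the averaging, like the function name, is an assumption.

```python
import numpy as np

def sliding_window_average(timecourse, width=2):
    """Average consecutive TRs within a sliding window.

    timecourse: array of shape (n_trs, ...), e.g., (n_trs, n_voxels).
    Returns an array of shape (n_trs - width + 1, ...)."""
    timecourse = np.asarray(timecourse, dtype=float)
    n_windows = timecourse.shape[0] - width + 1
    return np.array([timecourse[i:i + width].mean(axis=0)
                     for i in range(n_windows)])

# Four TRs of a single voxel yield three overlapping two-TR windows.
windows = sliding_window_average([1.0, 2.0, 3.0, 4.0])
```

The reconstruction analysis would then be repeated on each windowed pattern in turn.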
On trials without distractors (Fig. 6, left column), the orientation in memory could be reliably reconstructed in the contralateral V1–V3 ROI for the majority of the trial. Although weaker, a similar pattern was observed in the ipsilateral V1–V3 ROI. In contrast, the contralateral IPS ROI only showed reliable orientation coding during the first half of the trial.
On trials with distractors (Fig. 6, right column), orientation information could be reconstructed from all three ROIs in the first half of the trial. However, after distractor presentation, representations in both contra- and ipsilateral V1–V3 shifted toward the distractor orientation. Reliable information was observed both before and after the distractor period in the IPS, although an apparent shift in the representation during the time of distractor presentation may correspond to transient processing of the distractor.
Discussion
A significant puzzle that remains for models of VWM is how precise working memory representations persist through subsequent perceptual input. To provide insight into this question, we applied an inverted encoding model orientation reconstruction approach to multivoxel fMRI activity patterns during a delayed estimation task for orientation, with intervening distractors. Given that distractors that differ from a remembered stimulus along a task-relevant feature dimension have been shown to systematically bias VWM reports (Rademaker et al., 2015), we examined early visual cortical and intraparietal sulcus VWM representations during perception and during VWM maintenance, both before and after distraction. The use of an inverted encoding model allowed us to characterize VWM representations at each of these stages of the task, to examine when and where reliable orientation representations persisted, and to test whether VWM representations in these brain regions exhibited the distractor-related biases that have been observed behaviorally.
In line with previous results (Ester et al., 2009; Harrison and Tong, 2009; Serences et al., 2009), we were able to reconstruct reliable orientation information from visual cortex (V1–V3) during both perception and VWM maintenance. Orientations held in VWM can be decoded from both ipsilateral and contralateral visual cortex (Ester et al., 2009), although decoding can also be specific to the contralateral hemisphere if the task demands that the representation be spatially bound (Pratte and Tong, 2014). Indeed, we found that while perceptual reconstructions showed the expected retinotopic bias, orientation information was represented with comparable quality in both hemispheres of early visual cortex during VWM maintenance. In addition, orientation information could be reconstructed from the IPS contralateral to the remembered stimulus, both at perception and during VWM maintenance (Ester et al., 2015). Finally, the IPS ipsilateral to the remembered stimulus also showed reliable orientation coding during perception, but this representation was significantly less reliable than those in the early visual areas and did not persist during VWM maintenance.
We next tested whether an oriented grating presented in the same spatial location as the remembered stimulus would (1) bias behavior and (2) bias or disrupt the observed VWM representations. First, at the behavioral level, we found that distractor presentation did not decrease mean VWM precision or increase the frequency of random guesses, but that the reported orientation was biased about one degree toward the orientation of the distractor. Similarly, at the neural level, we found that orientation representations in contralateral V1–V3 were significantly biased toward distractor orientations. In contrast, the orientation representations in the IPS showed no evidence for a distractor bias and continued to represent the remembered orientation after visual interference. On trials in which distractors were not presented, VWM representations persisted through the end of the trial in the early visual areas, but not the IPS. This pattern of results is in contrast to previous findings (Bettencourt and Xu, 2016), which suggested that superior IPS maintains VWM representations for orientation with and without distraction, and that early visual areas maintain VWM representations only when distraction is unpredictable. Rather, our data suggest that precise stimulus-specific VWM representations for orientation are maintained in the early visual areas, and the involvement of IPS during VWM dynamically adapts to task demands. We found that orientation information was maintained redundantly in the early visual areas and in the IPS in anticipation of possible distraction, and it persisted in the early visual areas (and less so in the IPS) if a distractor did not appear. However, whereas these early visual VWM representations were susceptible to interference, the IPS representations were distractor-resistant. It could be, therefore, that the difference we observed between the magnitudes of the neural and behavioral biases is a result of this adaptive IPS involvement.
The finding that content-specific VWM representations can be redundantly maintained in the visual and parietal cortices, and that those representations are differentially affected by distraction, underscores the flexibility of the underlying neural mechanisms supporting VWM. The set of regions primarily involved in VWM maintenance, and the dominant representational scheme (for review, see Serences, 2016), may vary both as a function of task demands and of expectations regarding the potential for distraction (Bettencourt and Xu, 2016). In our task, participants knew that distractors were likely, and that they would appear in the same spatial location as the remembered stimulus. This may have encouraged both a spreading of the representation to ipsilateral early visual areas and a preparatory engagement of the IPS. Interestingly, it was not the case that early visual areas switched entirely to coding the distractor orientations. Rather, distractor processing resulted in biased representations that were intermediate to the remembered and distractor orientations. Although we do not have the cortical layer-specific measurements to fully disentangle early visual area-intrinsic maintenance from that supported by tonic top-down input from prefrontal (Mendoza-Halliday et al., 2014) or parietal cortices, it is possible that feedback is essential for reducing the influence of task-irrelevant distractors on precise VWM representations.
Finally, future studies should investigate whether the observed dynamic tradeoff between the early visual areas and the IPS is the result of an explicit strategy shift or whether it is a more automatic process. For instance, it is possible that orientation information is by default redundantly coded in VWM in the early visual areas and the IPS (although perhaps in a more spatial code in the IPS). This IPS representation could then be drawn upon for future behavior if more sensory-based codes have been degraded by visual interference. Further, it could be that the small behavioral biases that have been observed as a result of distraction reflect a weighted averaging of the VWM signals from early visual areas and IPS, informed by a metacognitive readout of VWM representational uncertainty. Indeed, participants show greater behavioral biases in motion perception (Vintch and Gardner, 2014) and memory for spatial frequency (Dubé et al., 2014) under conditions of increased uncertainty. In addition, sensory uncertainty decoded from visual cortex is predictive of a common perceptual bias in which orientations are perceived as biased away from the cardinal axes (van Bergen et al., 2015). Thus, future studies should examine whether the combination of uncertainty and distractor-related biases decoded from early visual areas and/or IPS during VWM maintenance can predict trial-by-trial behavioral responses.
Footnotes
This work was supported by NIH Grant MH63901 to M.D., PIOF-GA-2013-624380 to A.R.E.V., and pre-doctoral NRSA fellowship NIMH F31MH107157 to E.S.L., as well as the National Science Foundation Major Research Instrumentation Program, award number BCS-0821855. We thank J. Tseng for assistance with participant training and data collection.
The authors declare no competing financial interests.
Correspondence should be addressed to Elizabeth S. Lorenc, Helen Wills Neuroscience Institute, University of California, Berkeley, 132 Barker Hall, Berkeley, CA, 94720-3190. elizabeth.lorenc@berkeley.edu