Abstract
Expectations generated by predictive cues increase the efficiency of perceptual processing for complex stimuli (e.g., faces, scenes); however, the impact this has on working memory (WM) and long-term memory (LTM) has not yet been investigated. Here, healthy young adults performed delayed-recognition tasks that differed only in stimulus category expectations, while behavioral and functional magnetic resonance imaging data were collected. Univariate and functional-connectivity analyses were used to examine expectation-driven, prestimulus neural modulation, networks that regulate this modulation, and subsequent memory performance. Results revealed that predictive category cueing was associated with both enhanced WM and LTM for faces, as well as baseline activity shifts in a face-selective region of the visual association cortex [i.e., fusiform face area (FFA)]. In addition, the degree of functional connectivity between FFA and right inferior frontal junction (IFJ), middle frontal gyrus (MFG), inferior frontal gyrus, and intraparietal sulcus correlated with the magnitude of prestimulus activity modulation in the FFA. In an opposing manner, prestimulus connectivity between FFA and posterior cingulate cortex, a region of the default network, negatively correlated with FFA activity modulation. Moreover, whereas FFA connectivity with IFJ and the precuneus predicted enhanced expectation-related WM performance, FFA connectivity with MFG predicted LTM improvements. These findings suggest a model of expectancy-mediated neural biasing, in which a single node (e.g., FFA) can be dynamically linked or disconnected from different brain regions depending on prestimulus expectations, and the strength of distinct connections is associated with WM or LTM benefits.
Introduction
Efficient cortical processing of complex visual stimuli (e.g., faces, scenes) is aided by expectations of their appearance (Puri and Wojciulik, 2008). Thus, optimal performance requires that we interact with our environment in a flexible and forward-looking manner, using cues to prepare neural systems before an event occurs. For example, valid category cueing enhances the speed and accuracy by which stimuli are detected and discriminated (Puri and Wojciulik, 2008; Puri et al., 2009; Esterman and Yantis, 2010). Given that perception influences subsequent memory (Berry et al., 2010; Rutman et al., 2010) and that these two processes involve overlapping neural resources (Rees et al., 2002; Naghavi and Nyberg, 2005), we hypothesized that expectations of the appearance of stimuli from a known category would result in memory benefits. Predictive cues have been demonstrated to enhance spatial working memory (WM) (Schmidt et al., 2002), but the effects of category expectation on memory [WM or long-term memory (LTM)] of complex stimuli have not yet been evaluated. In the current study, we explored whether stimulus category expectation enhances WM and LTM of complex stimuli.
Studies using functional magnetic resonance imaging (fMRI), electroencephalography, and single-unit recordings offer converging evidence that a fundamental mechanism by which expectation influences performance is via activity modulation in sensory cortical areas before stimulus presentation (i.e., baseline shifting). Prestimulus activity modulation in visual cortex has been shown to occur after many types of predictive cues, including spatial (Kastner et al., 1999), feature (Chawla et al., 1999; Shulman et al., 1999; Giesbrecht et al., 2006), and object category (Puri et al., 2009) cues. It is believed that this process biases activity in brain regions that represent goal-relevant information, such that attended stimuli are afforded a processing advantage (Luck et al., 1997; Chelazzi et al., 1998; Kastner et al., 1999). Indeed, prestimulus activity enhancement has been shown to correlate with perceptual performance in humans (Ress et al., 2000; Sapir et al., 2005; Giesbrecht et al., 2006), although a single-unit study in visual cortex of macaques that examined the correlation between trial-by-trial signal variations preceding a stimulus and performance accuracy did not reveal the same relationship (Cook and Maunsell, 2002). Thus, the impact of prestimulus activity modulation on behavior is still a matter of debate; however, it is possible that univariate measures are not ideal to examine this complex relationship.
According to the prevailing model, sensory cortical activity biasing during periods of prestimulus expectation is mediated via top-down control from a frontoparietal attention network, which includes the intraparietal sulcus (IPS), superior parietal lobule, dorsal supramarginal gyrus (SMG), frontal eye fields (FEFs), and middle frontal gyrus (MFG) (Moran and Desimone, 1985; Desimone and Duncan, 1995; Ungerleider et al., 1998; Kastner et al., 1999; Shulman et al., 1999; Hopfinger et al., 2000; Corbetta and Shulman, 2002; Bressler et al., 2008; Summerfield and Egner, 2009; Esterman and Yantis, 2010). However, the vast majority of evidence for this view is indirect, based on univariate analysis of neuroimaging data in humans and single-unit recordings in animals, in which activity in each brain region is evaluated independently. Functional connectivity analysis (McIntosh, 1998) [e.g., β-series correlation (Gazzaley et al., 2004; Rissman et al., 2004; Clapp et al., 2010; Zanto et al., 2010)], which capitalizes on trial-to-trial variability to examine the relationship between brain regions during a given task and task stage, provides a powerful approach for evaluating functional networks that contribute to activity biasing in sensory cortices and may serve as a more robust correlate of performance benefits.
In the current study, fMRI data were collected while young adults performed delayed-recognition tasks that differed only in instructions informing them of the category of complex stimuli to be remembered (predictive tasks: participants knew whether the stimulus would be a face or a scene; neutral tasks: they did not know whether it would be a face or a scene). To examine the neural and behavioral consequences of predictive category expectancy, we used functional connectivity analysis to assess the role of frontoparietal networks on prestimulus visual cortical activity modulation, as well as subsequent WM and LTM performance.
Materials and Methods
Participants.
Twenty healthy young adults (mean age, 23.4 ± 3.06 years; range, 18–28 years; eight males) gave written informed consent to participate in and completed this study, which was approved by the University of California, San Francisco, Committee for Human Research. Two participants' data were removed from the group analysis because of excessive motion artifacts. All participants had normal or corrected-to-normal vision and were screened to ensure they had no history of neurological, psychiatric, or vascular disease, were not depressed, and were not taking any psychotropic medications. All participants had a minimum of 12 years of education.
Task design.
Stimuli consisted of grayscale images of faces and natural scenes presented on a black background (Fig. 1). The face stimuli consisted of a variety of neutral-expression male and female faces across a wide age range. Hair and ears were digitally removed from the images and a blur was applied along the contours of the face to remove any non-face-specific features. Each stimulus was used in only one trial per experimental session. Individual face and scene stimuli were randomized to different conditions across participants to ensure that potentially distinctive stimuli did not confound a particular condition. Images were 225 pixels wide and 300 pixels tall (14 × 18 cm), subtended 3 degrees of visual angle from fixation, and were presented foveally.
The experiment used three delayed-recognition conditions: stimulus-known faces (SKf trials, predictive), stimulus-known scenes (SKs trials, predictive), and stimulus-unknown (SUf and SUs trials, neutral), and a passive-view (PVf and PVs trials, passive) no-expectation baseline task. The SUf and SUs conditions are considered “neutral” in that the participants were expecting to perform a WM task on the forthcoming stimuli, but were not biased toward either of the two categories. This is comparable with the use of the term “neutral” in the Posner spatial cueing paradigm (Posner, 1980) and does not signify “nonpredictive.” Equal numbers of predictive, neutral, and passive-view trials for each object category (faces and scenes) were presented. Participants were given detailed instructions and underwent several practice trials immediately before the scanning session.
At the initiation of each WM block, participants were presented with the 100% predictive instruction of “remember the face” or “remember the scene” (SKf or SKs) or the neutral instruction of “remember the face or the scene” (SU). A 6 s expectation period was signaled by a gray-to-green color change of a fixation cross. This was followed by a brief 300 ms target stimulus and a subsequent 5.7 s delay period during which a red fixation cross was presented. Target stimuli were brief to encourage expectation of the stimulus by the participant. The trial concluded with a probe stimulus that was always consistent in stimulus category with the target stimulus. Participants were instructed to indicate whether or not the probe was exactly the same stimulus as the target by responding with a button press (right for match, left for nonmatch) as quickly as possible without sacrificing accuracy. The probe stimulus was followed by a 9.7 s intertrial interval during which a gray fixation cross was present. Trials with shorter expectation periods were inserted at random to encourage early stimulus expectation. Four of these trials were included in each WM block, two trials with 2 s expectation periods and two trials with 4 s expectation periods, all of which were excluded from the final analysis by modeling these trials with a unique regressor in the general linear model (GLM). For the passive-view trials, delay and probe periods were removed to ensure that all target stimuli (predictive, neutral, and passive view) were preceded by an equivalent period of fixation. Thus, passive-view trials began with a 6 s period of “no expectation,” followed by a “target” stimulus (300 ms) and a 9.7 s intertrial interval (gray cross). Trials were presented in a pseudorandomized block design [10 blocks total; 2 SKf, 2 SKs, 4 SU (SUf and SUs), 2 PV (PVf and PVs)] with 19 trials per WM block (15 of which were included in the final analysis) and 15 trials per passive-view block.
During all delay periods, participants were explicitly instructed to “maintain a mental image” of the memoranda, and to avoid verbal mnemonic strategies. In postexperiment questionnaires, all subjects reported that they used visual imagery during delay periods, that they were awake and alert during the experiment, and that faces were generally more difficult to recognize at probe than scenes.
Incidental LTM was assessed with a surprise postexperiment recognition test that took place ∼30 min after the main experiment. Participants were presented images of faces and scenes from each condition in the experiment. Stimuli were presented at a self-paced rate. Sixty percent of tested stimuli were the memoranda from nonmatch trials and thus were viewed only once before the postexperiment test. As lures, the remaining 40% of stimuli were novel face or scene images. Participants were instructed to respond with a confidence score using a four-point forced choice Likert scale for each stimulus (4, confident the stimulus did appear in the experiment; 3, less confident the stimulus did appear in the experiment; 2, less confident the stimulus did not appear in the experiment; 1, confident the stimulus did not appear in the experiment). To normalize confidence scores to each participant's response bias, indices for each condition were calculated as the confidence score for stimuli of a particular condition minus the confidence score for category-specific novel images.
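The bias-normalized recognition index described above can be illustrated with a minimal sketch. It assumes that the "confidence score" for a set of stimuli is the mean rating on the four-point scale, which the text does not state explicitly; the function name and the ratings are hypothetical.

```python
from statistics import mean


def recognition_index(condition_ratings, novel_ratings):
    """Mean confidence for studied stimuli of one condition minus the
    mean confidence for novel lures of the same category.

    Assumption: the per-condition confidence score is a simple mean.
    """
    return mean(condition_ratings) - mean(novel_ratings)


# Hypothetical ratings (1-4 scale) from one participant.
predictive_face_ratings = [4, 3, 4, 2, 3]
novel_face_ratings = [2, 1, 2, 3, 2]

index = recognition_index(predictive_face_ratings, novel_face_ratings)
print(f"Recognition index: {index:.2f}")
```

An index near zero would indicate that studied stimuli were rated no differently from novel lures, so a participant's overall tendency to respond "old" or "new" cancels out.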
Region of interest localization.
An independent functional localizer task was used to identify the face-selective fusiform face area (FFA) (Kanwisher et al., 1997) and the scene-selective parahippocampal place area (PPA) (Epstein and Kanwisher, 1998) in the visual association cortex of each participant. Participants performed 10 blocks of a 1-back task. Each block was 16 s in length and included face stimuli, scene stimuli, or fixation (rest). Blocked face and scene stimuli regressors were used to generate SPM[T] images, from which regions of interest (ROIs) were identified. For the contrast of faces > scenes, a face-selective ROI, the right FFA, was identified as the cluster of 35 contiguous voxels with the highest t value within the right fusiform gyrus of each participant; the right FFA has been shown to be more strongly activated by faces (Bentin et al., 1996; Kanwisher et al., 1997). For the contrast of scenes > faces, a scene-selective ROI, the left PPA, was also identified as the cluster of 35 contiguous voxels with the highest t value within the left parahippocampal gyrus of each participant. The left PPA has been shown to be more selective for scenes (Kanwisher et al., 1997) and the strongest region of attentional modulation for scenes (Gazzaley et al., 2005). The ROI voxel extent was based on methodology from similar studies (Gazzaley et al., 2004; Rissman et al., 2004; Clapp et al., 2010) and was used to achieve a reasonable balance between regional specificity (diminished by the use of a larger cluster) and susceptibility to noise (a problem with smaller clusters). The contrast of rest > n-back was used to identify ROIs from the default network; ROIs were chosen as the cluster of 100 contiguous voxels with the highest t value within anatomical regions of the default network (Gusnard et al., 2001; Raichle et al., 2001). 
These ROIs were expected to be larger based on previous accounts of these functional regions (Fox et al., 2005; Bar, 2007; Buckner and Vincent, 2007; Corbetta et al., 2008; Mayer et al., 2010); therefore, a larger cluster extent was applied.
Data acquisition and analysis.
All images were acquired on a Siemens 3T Magnetom Trio with stimuli presented on a liquid crystal display monitor positioned behind the head of participants and viewed using a mirror rigidly attached to a 12-channel receive-only head coil. Echo planar imaging data were acquired [flip angle (FA), 80°; echo time (TE), 30 ms; repetition time (TR), 2000 ms] with 29 interleaved 3.0 mm axial T2*-weighted slices (0.5 mm interslice gap) and a 1.8 × 1.8 × 3.0 mm voxel size [field of view (FOV), 230 mm; 128 × 128 matrix]. In addition, high-resolution (T1-MPRAGE) anatomical volumes were acquired [1 × 1 × 1 mm voxel size; FOV, 160 × 240 × 256 mm; TR, 2300 ms; TE, 3 ms; FA, 9°]. Raw blood oxygen level-dependent (BOLD) data were corrected off-line for slice-timing acquisition and motion artifacts. A 5 mm isotropic Gaussian smoothing kernel was applied before modeling the data. All trial stages were modeled as stick functions (events) convolved with the canonical synthetic hemodynamic response function (HRF) (SPM5; Wellcome Department of Imaging Neuroscience, London, UK) in a GLM, similar to a recent report on prestimulus periods (Puri et al., 2009). The use of events and the canonical HRF regressors for each trial stage achieved minimal covariance between the expectation regressors and encode and probe regressors. The onset of the expectation regressor was time-locked with the gray-to-green fixation-cross color change, the onset of the encode regressor was time-locked with target-stimulus onset, and the onset of the probe regressor was time-locked with probe-stimulus onset. In addition, three translational (X, Y, Z) and three rotational (pitch, roll, yaw) motion parameters were included in the GLM. The resulting regression vector yielded scalar β weights corresponding to the relative changes in signal strength associated with a particular trial stage. Incorrect trials were modeled with a separate regressor and excluded from the final analysis. 
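The event-based regressors described above can be sketched as follows. This is an illustration under stated assumptions, not the SPM5 implementation: the double-gamma parameters (peak shape 6, undershoot shape 16, ratio 1/6) are commonly cited defaults and are assumed here, and all function names are hypothetical. Only the 2 s TR and the within-trial onsets (expectation at 0 s, encode at 6 s, probe at 6 + 0.3 + 5.7 = 12 s) come from the text.

```python
import math

TR = 2.0  # repetition time in seconds (from the acquisition parameters)


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (
        math.gamma(shape) * scale ** shape
    )


def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Assumed double-gamma approximation of a canonical HRF."""
    return gamma_pdf(t, peak) - ratio * gamma_pdf(t, undershoot)


def stick_regressor(onsets_s, n_scans, tr=TR, hrf_length_s=32.0):
    """Convolve unit impulses at the given onsets with the HRF sampled at TR."""
    kernel = [double_gamma_hrf(i * tr) for i in range(int(hrf_length_s / tr) + 1)]
    regressor = [0.0] * n_scans
    for onset in onsets_s:
        start = int(round(onset / tr))  # scan index of the event
        for j, k in enumerate(kernel):
            if start + j < n_scans:
                regressor[start + j] += k
    return regressor


# One WM trial: separate regressors for each trial stage.
n_scans = 20
expectation = stick_regressor([0.0], n_scans)
encode = stick_regressor([6.0], n_scans)
probe = stick_regressor([12.0], n_scans)
```

Because the three onsets are 6 s apart, the modeled responses overlap in time but peak at different scans, which is what allows separate β weights to be estimated for each stage.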
Group whole-brain maps were calculated from Montreal Neurological Institute (MNI) normalized data, and all figures appear in neurological convention. For all analyses, a single-voxel statistical threshold of p < 0.01 was used. Where applicable, we performed Monte Carlo simulations using the AlphaSim function in the AFNI toolbox (Cox, 1996) to determine a cluster extent of 35 nearest-neighbor voxels to achieve a corrected p < 0.05. All values are presented as the mean ± SEM. Statistical significance of behavioral changes was assessed using repeated-measures ANOVAs and paired two-tailed t tests. Statistical significance of univariate effects was assessed using paired two-tailed t tests.
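As a brief illustration of the paired comparisons used throughout, the sketch below computes a paired t statistic and its degrees of freedom. The function name and the accuracy values are hypothetical; with the 18 participants retained for group analysis, the same calculation yields 17 degrees of freedom, matching the t(17) values reported in the Results.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # SEM of the differences
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-participant accuracies (proportion correct).
predictive = [0.93, 0.88, 0.95, 0.90]
neutral = [0.87, 0.85, 0.90, 0.86]

t_stat, df = paired_t(predictive, neutral)
print(f"t({df}) = {t_stat:.2f}")
```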
A major aim of this study was to examine top-down control networks associated with expectation-driven baseline shifts that are largest in magnitude under predictive conditions (Puri et al., 2009). For this reason, SUf and SUs (nonspecific expectation) were the primary baseline conditions contrasted against the associated predictive conditions (SKf and SKs). The contrast SKf > SUf (or SKs > SUs) permits a comparison of identical WM tasks, with only the added certainty that a face (or a scene) stimulus will appear for the predictive condition, thus revealing networks specifically involved in predictive expectation.
Functional connectivity.
Functional connectivity network maps were created for each subject as described previously using a β-series correlation analysis approach (Gazzaley et al., 2004; Rissman et al., 2004). For this analysis, a new GLM design matrix was constructed to model each trial stage (expectation, encode, probe) with a unique covariate, resulting in 516 covariates of interest [(19 trials × 8 WM blocks × 3 covariates per WM trial) + (15 trials × 2 PV blocks × 2 covariates per PV trial)]. ROI (FFA and PPA) β values were then correlated across trials with every brain voxel resulting in 516 per-voxel measures of covariate activity with each ROI for each participant. To minimize the impact of noise introduced by smoothing and normalization on the correlation procedure, the β-series correlation analysis was first conducted in each participant's native space. Using normalization parameters derived from each participant's mean echo planar image, single-participant statistical maps were subsequently normalized to the MNI template (2 × 2 × 2 mm voxel size) and Gaussian smoothed (5 mm full-width at half-maximum) for group analysis. β-Series connectivity main effect analysis derived t maps for each condition. Although multiple trial stages were modeled, only the expectation period was subject to analysis. Images depict regions that survived Bonferroni's correction for all noncerebellar brain voxels (∼1.6 × 10⁵; t = 7.69) (see Fig. 4). Nonparametric permutation tests were used to calculate whole-brain contrast maps between conditions (Nichols and Holmes, 2002) (see Fig. 5). Functional-connectivity maps were corrected for multiple comparisons in a manner identical to that used for univariate maps. Statistical significance of functional connectivity changes was assessed using paired two-tailed t tests.
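The core of the β-series approach can be sketched as follows: the trial-wise β values of a seed ROI for one trial stage and condition are correlated with the corresponding β series of every other voxel. This is a minimal illustration only; the function names, the dictionary layout, and the toy values are assumptions, and any further transformation of the correlation values applied in the cited methods is not reproduced here.

```python
import math


def n_covariates(wm_trials=19, wm_blocks=8, wm_stages=3,
                 pv_trials=15, pv_blocks=2, pv_stages=2):
    """Number of single-trial covariates in the design described in the text."""
    return wm_trials * wm_blocks * wm_stages + pv_trials * pv_blocks * pv_stages


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def beta_series_map(seed_betas, voxel_betas):
    """Correlate the seed's trial-wise betas with each voxel's beta series.

    seed_betas: per-trial expectation-stage betas averaged over the seed ROI.
    voxel_betas: hypothetical mapping of voxel id -> per-trial betas.
    """
    return {vox: pearson_r(seed_betas, series) for vox, series in voxel_betas.items()}


# Toy example: four expectation-period betas from one condition.
seed = [1.0, 2.0, 3.0, 4.0]
voxels = {"v1": [2.0, 4.0, 6.0, 8.0], "v2": [4.0, 3.0, 2.0, 1.0]}
connectivity = beta_series_map(seed, voxels)
print(n_covariates(), connectivity)
```

Voxels whose trial-to-trial β fluctuations track those of the seed receive high correlation values, which is the sense in which they are considered functionally connected during that trial stage.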
Regression analysis.
To further explore potential sources of prestimulus FFA activity modulation, functional connectivity measures were extracted from significant frontoparietal regions (SKf > SUf) and default network regions [PVf > (SKf + SUf)], as well as default network ROIs selected based on the functional localizer contrast rest > n-back. Across-participant connectivity values from each ROI were regressed against across-participant prestimulus FFA activity modulation measures for each condition, as well as indices for differences between conditions. Resultant Pearson's r coefficients and corresponding p values for regions that survived false discovery rate (FDR) correction for multiple comparisons are reported (see Fig. 6). Note: The statistical relevance of any ROI was not driven by the data from any individual participant.
In addition, across-participant regression analyses were performed between connectivity indices (SKf–SUf) from each ROI and memory performance indices (SKf–SUf) for both WM recognition accuracy and LTM recognition scores. Resultant Pearson's r coefficients and corresponding p values for regions that survived FDR correction for multiple comparisons are reported (see Fig. 7).
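The multiple-comparison step for the ROI-wise correlations can be illustrated with a short sketch. The text does not specify which FDR procedure was applied; the Benjamini-Hochberg step-up procedure is assumed here purely for illustration, and the function name and p values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which tests survive FDR at level q.

    Assumption: Benjamini-Hochberg step-up procedure; the source only
    states that an FDR correction was used.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank whose p value falls under the BH line q * rank / m.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            survives[idx] = True
    return survives


# Hypothetical p values for correlations from four ROIs.
roi_p_values = [0.001, 0.04, 0.03, 0.2]
print(benjamini_hochberg(roi_p_values))
```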
Results
Behavioral performance
To examine the impact of predictive stimulus category information on WM performance, a 2 × 2 repeated-measures ANOVA for WM accuracy with stimulus category (face, scene) and instruction (predictive, neutral) as factors was performed. This revealed a significant effect of both stimulus category (F(1,17) = 30.12; p < 0.0001) and instruction (F(1,17) = 7.18; p < 0.05), as well as a stimulus category by instruction interaction (F(1,17) = 4.45; p < 0.05). Pairwise comparisons indicated participants performed significantly better on scene stimuli compared with faces (94.1 ± 1.3%; 88.1 ± 1.7%; t(17) = 5.57, p < 0.0001) and, importantly, when predictive instructions were given versus neutral instructions (92.5 ± 1.7%; 89.6 ± 1.3%; t(17) = 2.74, p < 0.05) (Fig. 2A). Post hoc t tests indicated the accuracy-enhancing effect of predictive information was driven by face performance, such that SKf > SUf (90.8 ± 2.1%; 85.4 ± 1.7%; t(17) = 3.01, p < 0.01), whereas no difference in accuracy was observed for scene stimuli (SKs, 94.3 ± 1.4%; SUs, 93.9 ± 1.5%; t(17) = 0.329, p > 0.7).
For reaction times, a 2 × 2 repeated-measures ANOVA with stimulus category (face, scene) and instruction (predictive, neutral) as factors revealed no main effect of stimulus category or instruction (p > 0.1), but an interaction (F(1,17) = 5.07; p < 0.05). Post hoc t tests revealed a difference between predictive and neutral conditions for scene stimuli only, such that SKs > SUs (724.16 ± 57.4 ms; 665.66 ± 49.33 ms; t(17) = 3.27, p < 0.01).
For the LTM recognition test, a 2 × 3 repeated-measures ANOVA of recognition indices with factors of stimulus category (face, scene) and instruction [predictive, neutral, passive (not in figure)] revealed a main effect of instruction (F(1,17) = 15.72; p < 0.001), but not stimulus category (F(1,17) = 0.04; p > 0.8) and an instruction by stimulus category interaction (F(1,17) = 8.79; p < 0.01) (Fig. 2B). Post hoc t tests indicated that, after the main experiment, participants remembered stimuli better when they were from the predictive condition (0.64 ± 0.10) than the neutral condition (0.47 ± 0.098; t(17) = 2.84, p < 0.05) or the passively viewed condition (0.19 ± 0.68; t(17) = 4.76, p < 0.001), and stimuli from the neutral condition better than the passively viewed condition (t(17) = 4.60; p < 0.01). Additional analysis revealed that participants remembered faces from the predictive condition (0.54 ± 0.083) better than from the neutral condition (0.40 ± 0.075; t(17) = 2.50, p < 0.05), whereas predictive and neutral condition scenes were equivalently recalled (0.74 ± 0.16; 0.54 ± 0.098; t(17) = 2.10, p > 0.05). Neutral-condition stimuli were remembered better than passive-view stimuli for scenes (−0.025 ± 0.11; t(17) = 4.92, p < 0.01), but not for faces (0.39 ± 0.076; t(17) = 0.039, p > 0.9).
Univariate results
Functionally defined visual cortical ROIs
Expectation-driven baseline activity shifts in stimulus-selective visual regions have recently been observed during object-based, perceptual tasks (Giesbrecht et al., 2006; Puri et al., 2009; Esterman and Yantis, 2010). We hypothesized that similar biasing effects would be observed in the context of a WM task. To evaluate this, we examined prestimulus univariate activity in stimulus-selective visual regions from each participant (i.e., FFA and PPA). Prestimulus univariate FFA activity was increased in the predictive condition for remembering faces versus the corresponding neutral and passive conditions (SKf > SUf and PVf; 3.36 ± 0.690, 1.87 ± 0.711, 1.24 ± 0.609; t(17) = 2.49, p < 0.05; t(17) = 3.13, p < 0.001), as well as versus the predictive condition for remembering scenes (i.e., faces were “uncued”) (SKf > SKs; 1.07 ± 0.656; t(17) = 2.73, p < 0.05) (Fig. 3). Of note, SUf = PVf = SKs (all comparisons, p > 0.3). These results indicate specificity for WM expectation-driven baseline shifts in category-specific visual regions, comparable with the shifts observed in the setting of perceptual tasks.
Analysis of prestimulus activity in the PPA revealed no analogous baseline shift for scenes, such that SKs = SUs = SKf = PVs (0.93 ± 0.571, 1.39 ± 0.545, 1.33 ± 0.362, 0.83 ± 0.505) (Fig. 3) (all comparisons, p > 0.5). The absence of a significant baseline increase in this scene-specific region mirrors the absence of benefits from predictive information on WM and LTM performance for scenes. It is possible that the lack of predictive-cue effects for scene memory and PPA modulation was the result of a performance ceiling effect. Evidence against this, however, was found by evaluating the subgroup of participants whose WM performance measures for scene stimuli were lowest. In the group of the nine lowest performing participants for scene stimuli, whose scene accuracy was statistically equivalent to the group performance for faces (89.8 ± 1.4%; 88.1 ± 1.7%; t(25) = 0.64, p = 0.45, unpaired t test), only four participants performed better in the predictive condition. Additionally, this subgroup's WM performance (90.4 ± 1.9%; 89.2 ± 1.9%; t(8) = 0.46, p = 0.66, paired t test) and prestimulus PPA activity (0.65 ± 0.78; 1.21 ± 0.88; t(8) = 1.11, p = 0.30, paired t test) were not statistically different for predictive versus neutral conditions. This implies that the effect of selectivity for faces observed in the current study was genuine. Perhaps, as others have suggested, the diversity of scene features makes it difficult to generate a robust template during the expectation period, unlike faces, which are more stereotyped (Esterman and Yantis, 2010). This is not to say that the PPA cannot be activated before stimuli by predictive cues (Puri et al., 2009; Esterman and Yantis, 2010); however, this may be a reflection of task differences. Specifically, previous reports used house stimuli rather than natural scenes as in the current paradigm. Although houses and natural scenes both activate the PPA, the stereotyped format of houses may facilitate anticipatory modulation.
Consistent with this interpretation, recent data revealed that more specific cues (e.g., “Penn bookstore” vs “classroom”) generate greater anticipatory activity in PPA (Epstein and Higgins, 2007). The hippocampus/parahippocampal gyrus system has also been implicated in predictions and the imagining of future events (Schacter et al., 2007, 2008). However, the PPA has been shown to be anatomically and functionally distinct from object category-insensitive memory regions (Prince et al., 2009).
Whole-brain analysis
A central aim of the current study was to examine neural mechanisms of top-down control involved in expectation-driven baseline shifts and memory benefits; therefore, the remainder of the analysis focuses on face-present trials only, where these effects were observed. Furthermore, given our goal of characterizing the processes that bias sensory processing during “specific” expectation (i.e., an ensuing stimulus category can be predicted with 100% certainty, SKf), SUf served as a “nonspecific” (neutral) expectation control. Examination of whole-brain univariate data using the primary contrast of interest, SKf > SUf, revealed significant activation in frontoparietal regions comparable with those previously reported during expectation periods for perceptual tasks (Kastner et al., 1999; LaBar et al., 1999; Shulman et al., 1999; Corbetta and Shulman, 2002; Todd and Marois, 2005; Vossel et al., 2006; Corbetta et al., 2008; Capotosto et al., 2009; Puri et al., 2009; Esterman and Yantis, 2010). This included the right IPS, right FEF, right MFG, right precentral gyrus, and right dorsal SMG. To assess whether these effects were also greater than the passive condition (“null” expectation), an examination of the SKf > (PVf + SUf) contrast was conducted and revealed a nearly identical set of frontoparietal regions, including right IPS, right FEF, right MFG, right precentral gyrus, and right dorsal SMG. These data support the hypothesis that the dorsal attention network is associated with top-down modulation of task-relevant sensory-cortical target areas during a period of specific expectation, even when goals are directed at memory encoding rather than perceptual objectives.
Functional connectivity results: expectation networks
It has been proposed that frontal and parietal regions that comprise the dorsal attention network generate top-down signals that bias processing of expected stimuli in sensory cortices (Kastner et al., 1999; Shulman et al., 1999; Hopfinger et al., 2000; Corbetta and Shulman, 2002; Serences et al., 2004; Giesbrecht et al., 2006; Silver et al., 2007; Sylvester et al., 2007; Bressler et al., 2008; Summerfield and Egner, 2009; Esterman and Yantis, 2010). Although univariate findings that report coincident expectation-driven frontoparietal and visual-cortical activity baseline shifts support such a claim (including the univariate results presented in the current study), this evidence is indirect. The β-series correlation method is a functional connectivity analysis approach that uses trial-by-trial variability to measure covariance in activity between spatially disparate regions, and thus offers a more powerful tool for assessing network interactions (Gazzaley et al., 2004; Rissman et al., 2004). FFA connectivity main-effect maps obtained during the expectation period of the SKf condition revealed a set of regions similar to, but not directly overlapping, those described in the univariate analysis as being nodes of the dorsal attention network: bilateral MFG (1 and 2), right inferior frontal junction (IFJ) (3), right IPS (4), and right precuneus (5) (red arrows) (Fig. 4A). FFA connectivity with these regions was not significant in the main-effect maps for the SUf and PVf conditions (Fig. 4B,C). A nonparametric analysis was used to contrast FFA connectivity maps during the expectation periods of different conditions (Nichols and Holmes, 2002). 
The main contrast of interest, SKf > SUf, revealed that FFA connectivity with frontoparietal network cortical regions, including bilateral MFG (1 and 2), right IFJ (3), right IPS (4), right precuneus (5), and right precentral gyrus as well as right inferior frontal gyrus (IFG) (6), was associated with specific expectation of the stimulus category (red arrows) (Fig. 5A, Table 1). In an additional analysis, a contrast of SKf > (PVf + SUf) was conducted and revealed a nearly identical profile of frontoparietal FFA connectivity, including bilateral MFG, right IFJ, right IPS, right precuneus, right IFG, and right precentral gyrus. Moreover, to examine FFA connectivity when there were WM goals, but no ability to predict a face stimulus with certainty, the contrast SUf > PVf was performed, and revealed no significant clusters. We thus conclude that these frontal and parietal regions are engaged in functional networks with the FFA only for specific stimulus category expectation.
To further examine the relationship between frontoparietal–FFA connectivity and FFA baseline shifts, we conducted an across-participant regression analysis of FFA connectivity with frontoparietal ROIs (identified by the SKf > SUf connectivity contrast) and the magnitude of prestimulus FFA activity in the SKf condition (corrected for multiple ROI comparisons). FFA connectivity magnitude positively correlated with FFA activity for the right IFJ (r = 0.67; p < 0.005) (Fig. 6A), right MFG (r = 0.81; p < 0.0001) (Fig. 6B), right IFG (r = 0.69; p < 0.005) (Fig. 6C), and right IPS (r = 0.73; p < 0.0005) (Fig. 6D). This same analysis was also conducted on similar ROIs derived from the SKf > (PVf + SUf) contrast, and revealed significant correlations between FFA connectivity measures and FFA activity modulation: right IFJ (r = 0.59; p < 0.01), right MFG (r = 0.66; p < 0.005), and right IPS (r = 0.68; p < 0.005). Although not causal, these data provide stronger evidence than univariate results, and even functional connectivity results alone, that these frontoparietal regions generate endogenous signals based on specific expectations of the nature of ensuing stimuli, which mediate top-down modulation of sensory cortical activity before stimulus presentation.
Interestingly, it was observed that several regions associated with the default network, a collection of regions that are more active at rest than during a task (Gusnard et al., 2001; Raichle et al., 2001), displayed significant FFA connectivity during the prestimulus period that preceded face stimuli in the passive-view condition, but not SKf or SUf conditions [e.g., posterior cingulate cortex (PCC) (7) and medial prefrontal cortex (mPFC) (8) (red arrows)] (Fig. 4C). To further investigate whether there were regions that displayed FFA connectivity decreases during the expectation period of the predictive face condition, we performed a functional connectivity contrast of SKf < (PVf + SUf). A pattern emerged of reduced default-network–FFA connectivity, which included retrosplenial cortex, bilateral PCC, bilateral regions of the medial temporal lobe, as well as regions of the lateral temporal cortex (Table 1). To explore the possibility that functional connectivity decreases during expectation may be invariant to specific expectation goals (i.e., occur for both predictive and neutral conditions), we evaluated the contrast PVf > (SKf + SUf). This contrast revealed task-related decreases in prestimulus FFA connectivity with a similar set of default network regions, also including the PCC (7), mPFC (8), and lateral parietal cortex (red arrows) (Fig. 5B, Table 1). These expectation-driven decreases in connectivity suggest that functional disconnection of the FFA from several default network regions is an aspect of preparing to encode an ensuing relevant stimulus, independent of specific stimulus category foreknowledge.
To examine the relationship between these functional connectivity reductions and FFA prestimulus activity shifts, across-participant regression analyses of FFA connectivity with these default-network regions and prestimulus FFA baseline shifts were performed for the SKf condition. This revealed a significant negative correlation only for the PCC region (r = −0.59; p < 0.01), using passive view as a baseline for both connectivity and FFA activity shifts (Fig. 6E). These results suggest that, for the default network, notably the PCC, functional disconnection from sensory cortices, which occurs under conditions of both predictive and neutral expectation, is permissive but not sufficient for the expectation-related baseline shifts, which also require increased connectivity with frontoparietal regions that occurs only during predictive conditions.
To explore these findings more formally, default-network ROIs were identified using the independent localizer task [contrast fixation (rest) > n-back] in the mPFC, PCC, and lateral parietal cortices. Results indicated that the PCC (region 7) (Fig. 4) displayed significant differential preparatory connectivity with the FFA, such that SKf < PVf (0.27 ± 0.05; 0.40 ± 0.03; t(17) = 4.41, p < 0.001) and SUf < PVf (0.27 ± 0.05; t(17) = 5.91, p < 0.0001), whereas SKf = SUf (t(17) = 0.71; p > 0.4). Interestingly, prestimulus PPA–PCC connectivity measures were statistically equivalent across all scene conditions (all values of p > 0.23). This may offer a mechanistic account of the null PPA univariate effect. FFA–PCC connectivity was then subjected to the same regression analysis with FFA activity as described previously, which once again revealed a significant negative correlation between FFA–PCC connectivity and FFA baseline shift (r = −0.58; p < 0.05).
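The condition comparisons reported above are within-participant contrasts, and the t(17) notation is consistent with a paired test over 18 participants. A minimal sketch of such a paired t statistic is given below; the helper name and the toy connectivity values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for the difference a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1

# Toy FFA-PCC connectivity values, one per hypothetical participant.
passive_view = [0.42, 0.38, 0.45, 0.36, 0.41]
predictive = [0.30, 0.22, 0.35, 0.20, 0.27]

t_stat, df = paired_t(passive_view, predictive)
print(f"t({df}) = {t_stat:.2f}")
```

A positive t here indicates higher connectivity in the first condition than in the second, matching the direction of the SKf < PVf comparison when passive view is entered first.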
Neurobehavioral correlations
Predictive instructions resulted in a benefit on both WM and LTM performance for faces (i.e., SKf > SUf). To investigate whether expectation-driven network changes were associated with this benefit, we conducted an across-participant, neurobehavioral regression analysis using FFA connectivity values with frontoparietal ROIs, identified by the SKf > SUf connectivity contrast (Fig. 5A), and performance measures. The regression of FFA connectivity (SKf–SUf) versus WM recognition accuracy (SKf–SUf) and LTM recognition scores (SKf–SUf) revealed significant correlations with the right IFJ (r = 0.72; p < 0.001) and right precuneus (r = 0.66; p < 0.005) for the WM comparisons, and with the left MFG (r = 0.74; p < 0.0005) for LTM comparisons (Fig. 7). The same regressions using ROIs derived from the SKf > (PVf + SUf) contrast revealed similar positive correlations for the WM comparisons with the right IFJ (r = 0.65; p < 0.005) and right precuneus (r = 0.73; p < 0.001), and for LTM comparisons with the left MFG (r = 0.62; p < 0.005). Thus, in the current study, distinct neural networks were associated with predictive, expectation-mediated WM or LTM performance benefits. No significant neurobehavioral correlations were obtained using the default network regions in regression analyses.
These results add to the accumulating evidence that the dorsolateral prefrontal cortex (PFC), which includes the MFG, is involved in LTM processes (Rossi et al., 2001; Sandrini et al., 2003; Blumenfeld and Ranganath, 2006, 2007; Manenti et al., 2010a,b). To evaluate the mechanism further, we performed a β-series functional connectivity analysis with the left MFG region as a seed. This analysis revealed increased expectation period connectivity of left MFG with bilateral hippocampi in the predictive condition (i.e., SKf > SUf and SKf > PVf). This suggests that the dorsolateral PFC may contribute to the long-term retention of task-relevant information via prestimulus increases in functional connectivity with the hippocampus, a structure well known for its involvement in LTM (Eichenbaum et al., 2007). In contrast, connectivity analysis using right IFJ and right precuneus as seeds (the regions found to correlate with WM accuracy) did not reveal task-differential connectivity with the hippocampus. Although recent results revealed that prestimulus hippocampal activity is beneficial for LTM performance (Park and Rugg, 2010), the current results advance this notion by suggesting that it may be mediated via top-down influences from the dorsolateral PFC.
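As a rough illustration of the general idea behind seed-based β-series connectivity, the sketch below correlates trial-by-trial β estimates from a seed region with those from a target region and applies a Fisher r-to-z transform. It assumes the per-trial β values have already been estimated for the task stage of interest; the function names and toy numbers are hypothetical, and the exact pipeline used in this study is not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher r-to-z transform, commonly applied before group-level statistics."""
    return math.atanh(r)

def beta_series_connectivity(seed_betas, target_betas):
    """Correlate two per-trial beta series and return the z-transformed value."""
    return fisher_z(pearson_r(seed_betas, target_betas))

# Toy per-trial beta estimates for one hypothetical participant and condition.
seed = [0.8, 1.1, 0.9, 1.4, 1.2, 0.7]    # e.g., left MFG seed
target = [0.5, 0.9, 0.6, 1.0, 1.1, 0.4]  # e.g., a hippocampal voxel or ROI

z = beta_series_connectivity(seed, target)
print(f"z = {z:.2f}")
```

Computing this value separately for each condition and participant gives the quantities that a contrast such as SKf > SUf would then compare across participants.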
Discussion
The current study provides evidence that expectations regarding the specific category of ensuing complex stimuli (i.e., faces) enhance both WM and LTM of those stimuli. fMRI connectivity analysis, used to explore the neural networks underlying memory benefits, revealed a frontoparietal network of regions functionally connected with stimulus-selective visual association cortex (i.e., right FFA) before stimulus presentation, only when specific predictive information was supplied. Furthermore, the magnitude of FFA connectivity with a subset of these regions (right IFJ, right MFG, right IFG, and right IPS) correlated with prestimulus activity modulation in the FFA (baseline shifts). It was also revealed that connectivity between the FFA and the PCC, a region of the default network, decreased during periods of both predictive and neutral WM expectation, and that the magnitude of this decrease correlated with FFA baseline shifts in the predictive condition. Finally, nonoverlapping neural networks were shown to correlate with expectation-driven benefits in WM and LTM performance.
This study first addressed the important cognitive question: Does prestimulus expectation benefit the short- and long-term memory of complex stimuli, or alternatively, does expectation have no effect on these measures? To our knowledge, only one other study investigated a related aspect of this question and revealed that prestimulus, predictive attentional capture (i.e., sudden-onset spatial cues) positively influenced visual spatial WM (Schmidt et al., 2002). We extend this finding by revealing that the benefit on WM can occur with purely top-down goals and that it can occur for category cueing. Furthermore, enhanced LTM of these stimuli is also engendered by predictive foreknowledge. Note that the current results do not distinguish the influence of predictions on WM and LTM from influences of attention on these cognitive operations, which has been studied extensively (for review, see Naghavi and Nyberg, 2005). A non-mutually exclusive interpretation of the current findings is that predictions help to guide attention, which then results in memory performance enhancements. Thus, attention may be governed by the predictive signal, but not identical with it. Also note that we differentiate the current experimental manipulation from those used in the extensive literature on the influence of motivation on memory (LaBar et al., 1999; Engelmann et al., 2009; Pessoa et al., 2009). Our view is that expectations based on foreknowledge of the reward value of ensuing stimuli (e.g., monetary or emotional valence) are distinct from expectations driven by previous information of the nature of ensuing stimuli (as in the current study), although there are likely overlapping neural processes that warrant additional investigation.
An important aspect of the current study design is that, in the neutral conditions, participants knew that either faces or scenes would be presented, and thus this task may be viewed as a “predictive” condition as well (e.g., they knew the stimulus would not be a house). However, the neutral condition clearly supplied less predictive information than conditions in which a face or a scene could be anticipated with 100% certainty. Our results revealed that only the SKf condition resulted in prestimulus modulation and engaged the frontoparietal network, and it also exhibited more successful memory performance than the SUf condition. The data do not allow us to distinguish between two alternative explanations for this finding: under neutral cueing, participants do not engage anticipatory mechanisms at all, or they attempt to engage these mechanisms in anticipation of both stimulus categories, but are ineffective in doing so.
Several previous studies have reported univariate fMRI evidence that expectation-related activity measures from dorsal-attention network regions are associated with baseline shifts in sensory regions. In the current study, functional connectivity analysis was used to more directly explore the role of putative control regions on baseline activity shifts in visual cortex. Only a few studies have taken this analytical approach. Most related, Bressler et al. (2008) used Granger causality to reveal that prestimulus activity in FEF and IPS predicted visual cortex activity baseline shifts in a spatial-cueing, perceptual discrimination task. In another recent study, functional connectivity of lateral parietal cortex with ventral visual cortices was reported to occur prestimulus in a perceptual recognition task, although the experiment did not reveal selective activity baseline shifts in visual cortex (Eger et al., 2007). The results presented here revealed that baseline shifts in the FFA occurred during the predictive condition for face stimuli, not in the neutral or passive viewing conditions, and that regions in the frontal and parietal cortices were functionally connected with the FFA during the expectation period. The set of regions identified was representative of several key regions previously characterized as the dorsal attention network, notably the MFG, IFJ, and IPS, as well as the right IFG, associated with the ventral attention network (Corbetta and Shulman, 2002; Shulman et al., 2009). Additional evidence that the degree of connectivity predicts the magnitude of the FFA baseline shifts provides a stronger indication that these areas mediate prestimulus, visual cortex modulation. Of note, caution in interpreting these results is warranted, as they are correlational in nature and do not permit causal inference. However, consistent with these results, recent studies reporting disruption of visual target detection and posterior alpha-band topography (Capotosto et al., 2009), as well as visual cortical BOLD measures (Ruff et al., 2008) by repetitive transcranial magnetic stimulation to right IPS, support a causal role of these frontoparietal regions in modulation of sensory cortical activity.
Key regions identified in the current study were located in the parietal cortex (IPS, precuneus) and were functionally connected with the FFA only during the expectation period of the predictive condition. The magnitude of IPS–FFA functional connectivity correlated with the baseline activity shifts in the FFA. These regions have been primarily identified in studies of expectation for locations (Kastner et al., 1999; Hopfinger et al., 2000; Bressler et al., 2008; Sylvester et al., 2008), although Shulman et al. (1999) provided evidence for IPS involvement in feature-based attention. The data presented here suggest that parietal regions are also involved in nonspatial, object category attention and may mediate the prestimulus biasing of task-relevant ensembles in ventral visual cortices to more efficiently encode relevant information into WM stores.
Several frontal regions were also identified in the network analysis, notably the right IFG, IFJ, and MFG, which were most functionally connected with the FFA during the expectation period of the predictive condition, the degree of which correlated with baseline activity shifts in the FFA. Previous studies have shown, and the current results suggest, a role for the right IFG in expectation-driven attention based on behavioral goals (Shulman et al., 2009). A nearby region, the IFJ, defined by Brass and von Cramon (2004), is located at the intersection of the inferior frontal sulcus and precentral sulcus, and has been shown to exhibit increased activity in prestimulus periods of discrimination tasks. The ambiguous locale of this region at the intersection of two major sulci has led to substantial variability in the terminology used to describe it. For example, the right posterior MFG region identified by previous studies as being active in the expectation period of spatial attention tasks (He et al., 2007; Szczepanski et al., 2010), and also as a key node that bridges the dorsal and the ventral attention networks (Fox et al., 2005; He et al., 2007; Corbetta et al., 2008), overlaps with the region designated in the current study as the IFJ. Accumulating evidence suggests that the IFJ is a functionally discrete region (Brass and von Cramon, 2004; He et al., 2007; Szczepanski et al., 2010), and a recent report has implicated the IFJ in attentional modulation of V4 color processing (Zanto et al., 2010). The current study supports this assertion by revealing a dissociation in the influence of the IFJ and MFG connectivity on memory performance. The data revealed that expectation-driven increases in IFJ–FFA connectivity were associated with improved WM, but not LTM performance, and that MFG–FFA connectivity was associated with improved LTM, but not WM performance. This suggests that the IFJ may control expectation-driven biasing of task-relevant sensory cortical regions in a similar manner to the precuneus, which benefits the active retention of face representations. Additional analysis directed at elucidating the mechanism of this dissociation revealed that left MFG–hippocampal functional connectivity was greater for the predictive than neutral condition, whereas this was not evident for IFJ–hippocampal connectivity. The role of the left MFG in LTM is well documented (Rossi et al., 2001; Sandrini et al., 2003; Blumenfeld and Ranganath, 2006, 2007; Manenti et al., 2010a,b), and the data presented here support assertions that the PFC and the hippocampus functionally interact in the service of recognition memory (Bunge et al., 2004; Karlsgodt et al., 2005; Staresina and Davachi, 2006; Israel et al., 2010).
The default network is a set of regions that have been shown to exhibit greater activity during periods of rest and passive fixation than during goal-directed tasks (Gusnard et al., 2001; Raichle et al., 2001). Since its initial description, interest in the role the default network plays in goal-directed activities has steadily grown. Regions of the default network consistently display task-induced deactivations across a wide variety of cognitive tasks (e.g., attention, memory, language processing, and motor) (Shulman et al., 1997; Binder et al., 1999; Mazoyer et al., 2001). In addition, increasing WM load and demands on visual attention for target detection lead to greater deactivation (McKiernan et al., 2003, 2006; Todd and Marois, 2005; Tomasi et al., 2007). However, the functional role of the default network in periods of prestimulus expectation has not yet been investigated, although its involvement has been speculated (Bar, 2007). The current study reveals that functional connectivity between the PCC, a region of the default network, and the FFA, decreases during expectation of a WM task, relative to passive view. This occurs independent of the predictive nature of the instructional information (i.e., for both predictive and neutral conditions).
The results presented here advance our understanding of the beneficial role that prestimulus expectation has on behavioral performance by revealing a positive impact on subsequent WM and LTM performance. Consistent with previous studies using perceptual tasks, performance improvements were associated with baseline shifts of visual cortical activity. Functional connectivity analysis further revealed that the neural mechanisms involved in such baseline shifts depend on two expectation-driven, prestimulus network changes: (1) a functional disconnection of visual areas from integration with default network regions based on general expectations of a WM task, and (2) an increase in functional connectivity between visual areas and the dorsal and ventral attention networks that is dependent on the predictability of stimulus category. Our findings suggest that attention and default networks engage with sensory cortices in a dynamic and flexible manner to modulate attentional orienting on the basis of expectations about ensuing stimuli, and that an individual's ability to achieve such engagement predicts their memory benefits.
Footnotes
- This work was supported by National Institutes of Health Grant 5R01AG030395 (A.G.). We thank Edrick Masangkay and Chips McSteeley, Jr., for their assistance.
- Correspondence should be addressed to Adam Gazzaley, University of California, San Francisco–Mission Bay, Genentech Hall, Room N472J, 600 16th Street, San Francisco, CA 94158-2240. adam.gazzaley{at}ucsf.edu