Abstract
The sensory recruitment hypothesis states that visual short-term memory is maintained in the same visual cortical areas that initially encode a stimulus' features. Although it is well established that the distance between features in visual cortex determines their visibility, a limitation known as crowding, it is unknown whether short-term memory is similarly constrained by the cortical spacing of memory items. Here, we investigated whether the cortical spacing between sequentially presented memoranda affects the fidelity of memory in humans (of both sexes). In a first experiment, we varied cortical spacing by taking advantage of the log-scaling of visual cortex with eccentricity, presenting memoranda in peripheral vision sequentially along either the radial or tangential visual axis with respect to the fovea. In a second experiment, we presented memoranda sequentially either within or beyond the critical spacing of visual crowding, a distance within which visual features cannot be perceptually distinguished due to their nearby cortical representations. In both experiments and across multiple measures, we found strong evidence that the ability to maintain visual features in memory is unaffected by cortical spacing. These results indicate that the neural architecture underpinning working memory has properties inconsistent with the known behavior of sensory neurons in visual cortex. Instead, the dissociation between perceptual and memory representations supports a role of higher cortical areas such as posterior parietal or prefrontal regions or may involve an as yet unspecified mechanism in visual cortex in which stimulus features are bound to their temporal order.
SIGNIFICANCE STATEMENT Although much is known about the resolution with which we can remember visual objects, the cortical representation of items held in short-term memory remains contentious. A popular hypothesis suggests that memory of visual features is maintained via the recruitment of the same neural architecture in sensory cortex that encodes stimuli. We investigated this claim by manipulating the spacing in visual cortex between sequentially presented memoranda such that some items shared cortical representations more than others while preventing perceptual interference between stimuli. We found clear evidence that short-term memory is independent of the intracortical spacing of memoranda, revealing a dissociation between perceptual and memory representations. Our data indicate that working memory relies on different neural mechanisms from sensory perception.
Introduction
Although a focus of research for decades, the neural basis of working memory storage is still disputed (Serences, 2016; Xu, 2017). Recent neuroimaging studies have demonstrated that items in memory can be decoded from activity in human primary visual cortex (V1). Whereas the amplitude of the blood-oxygenation-level-dependent (BOLD) signal within V1 is not predictive of a remembered stimulus, patterns of activity across voxels can predict memoranda reliably (Harrison and Tong, 2009; Serences et al., 2009). For example, in a study by Harrison and Tong (2009), observers viewed two sequentially presented oriented gratings and were cued to hold one item in memory so that they could later compare it with a test grating. These investigators found that the remembered stimulus orientation could be decoded from patterns of activity within V1 during the retention interval. They concluded that visual cortex retains information about features in working memory. Similar studies have found that activity patterns within early visual cortex are specific to only the task-relevant feature of multifeature objects (Serences et al., 2009) and that the precision of decoding diminishes with increasing numbers of memoranda (Emrich et al., 2013; Sprague et al., 2014).
These findings among others have led some researchers to conclude that memory storage mechanisms are located within the sensory neural systems involved in processing the stimulus attributes, a proposal termed the sensory recruitment hypothesis (Pasternak and Greenlee, 2005; Emrich et al., 2013; Sreenivasan et al., 2014; Serences, 2016). This hypothesis is appealing in part because visual cortex is thought to be one of the few brain areas with sufficient processing power to represent objects with the level of detail observed in short-term memory (for review, see Serences, 2016). However, it is not clear how visual cortex could maintain memory representations while simultaneously processing new incoming information nor how the different perceptual experiences of seeing versus remembering are accounted for by this hypothesis.
In contradiction to the sensory recruitment hypothesis, Bettencourt and Xu (2016) found that target features could not be decoded from early visual cortex when distractors were presented during the memory retention period, but that such distractors had no impact on behavioral performance. These investigators could decode activity reliably within a region of parietal cortex to predict the target stimulus regardless of whether a distractor was presented, suggesting an important role of that area in short-term memory. It remains contentious, therefore, whether visual cortex plays a necessary role in short-term memory maintenance (Xu, 2017).
In the present study, we tested whether the fidelity with which memoranda are stored is affected by the neural resources available within early visual cortex by varying the intracortical spacing of items. When items are presented simultaneously in the absence of working memory demands, their intracortical spacing is the primary constraint on their perceptual discriminability. Nearby stimuli “crowd” each other and the zone of crowding is determined by the distance between stimuli in retinotopic cortex (Pelli and Tillman, 2008; Pelli, 2008). Visual crowding occurs when the cortical spacing between visual objects prevents a distinct target representation in early visual cortex (Pelli, 2008; van den Berg et al., 2010; Anderson et al., 2012; Kwon et al., 2014; Chen et al., 2014) or results in pooling of stimulus representations at later levels of the visual hierarchy (Freeman and Simoncelli, 2011). If short-term memory of items presented in spatial isolation is maintained via the recruitment of the same sensory areas involved in the encoding of those features, then we should see worse memory performance for items that are closer together in visual cortex, and therefore share more neural resources, than for items with greater intracortical spacing.
Materials and Methods
Experiment 1 overview
We investigated whether log-scaling of visual cortex affects short-term memory by having observers remember three items on each trial arranged according to one of two spatial configurations and, using a method of adjustment, report the orientation of the item indicated by a probe. Within a trial, items were aligned along either the tangential axis or the radial axis and thus had greater or lesser intracortical spacing, respectively (see Fig. 1A,B). Importantly, in each configuration, one item appeared at 10° eccentricity directly to the right of fixation, so targets at this location were matched in all aspects except for the intracortical spacing between memoranda within the same trial. We thus focus analyses only on target items at this location, although all locations were probed equally often so as to encourage participants to store all items in short-term memory. Finally, we ensured that our data were not confounded by perceptual interference (Yeshurun et al., 2015) by presenting items sequentially and with sufficient durations and interstimulus intervals to negate such perceptual effects.
Participants.
Ten people participated in Experiment 1 (mean age 24 ± 3.07; 5 male, 5 female). All had typical color vision and normal or corrected-to-normal acuity and were naive to the purposes of the experiment. All observers gave written informed consent and were paid £10/h for their participation. The study was approved by the University of Cambridge Psychology Research Ethics Committee.
Experimental setup.
Participants sat in a head and chin rest positioned 57 cm from an ASUS LCD monitor. The resolution of the monitor was 1920 × 1200 within an area that was 44.8 × 28 cm with no pixel interpolation. Stimulus colors were selected after measuring the luminance of each color channel of the monitor with a spectrophotometer. Fixation was monitored online with an EyeLink 1000 (SR Research) recording at 500 Hz, calibrated once before each testing session and recalibrated as required throughout the experiment (see below). The experiment was programmed with the Psychophysics Toolbox Version 3 (Brainard, 1997; Pelli, 1997) and Eyelink Toolbox (Cornelissen, Peters, and Palmer, 2002) in MATLAB (The MathWorks).
Stimuli.
On each trial, three randomly oriented bars (2° × 0.2° of visual angle) were presented sequentially and each was uniquely colored red, green, or blue. Colors were matched in luminance (26.2 cd/m2) and the order in which they appeared and their screen position were randomized across trials. A white fixation spot was displayed in the center of the screen throughout stimulus presentation and the memory delay period. All stimuli were presented on a black background (luminance < 1 cd/m2).
Within a trial, stimulus positions were arranged either tangentially or radially with respect to the point of fixation (see Fig. 1B). In both conditions, one item was centered on the horizontal meridian 10° right of fixation. In the radial condition, the two other items were positioned 2° left or right of the central item such that they were arranged along the horizontal meridian. In the tangential condition, one item was positioned 2° above the central item and the other was positioned 2° below the central item such that they were arranged orthogonal to the horizontal meridian. Although never presented simultaneously, the interstimulus spacing meant that their positions did not overlap. The order in which a stimulus was presented at each position was randomized across trials.
Procedure.
A typical trial sequence is shown in Figure 1C. At the start of each trial, an observer had to maintain fixation within a 2° region of the fixation spot for 500 ms for the trial to proceed. If fixation remained outside of this region for >2 s, then the eye tracker was recalibrated. Once correct fixation was registered, there was an additional variable delay between 250 and 750 ms (uniformly distributed). Stimuli were then presented sequentially in either a tangential or radial arrangement (see Fig. 1B). The stimulus duration and interstimulus interval were 500 ms. After the offset of the third stimulus, there was a 500 ms delay period, after which a probe circle (2° diameter) appeared centered on the location previously occupied by one of the three items, cueing the observer to report the orientation of that item using the mouse. Once any movement of the mouse was recorded, a response bar replaced the probe circle and followed the orientation designated by the mouse position relative to the bar center. The response bar had the same dimensions as the target item, but its orientation was randomized at the start of each response period and remained on screen until the observer clicked the mouse button to confirm their response.
During pilot testing with white stimuli, we noted that it was difficult to attribute clearly the probe circle to one memory item based on location alone, particularly for the radial condition. This is most likely due to the well known compression of perceptual space in peripheral vision (White et al., 1992; McGraw and Whitaker, 1999), so during Experiment 1, the probe circle and response bar also matched the color of the target item. Participants were informed that all items were equally likely to be the target. The target appeared equally as often across temporal order and location. There were 324 trials consisting of 18 repetitions for each target combination (three target locations for each of two spatial arrangements and three temporal orders).
If gaze position deviated by >2° from the fixation spot during stimulus presentation, the interstimulus interval, or the delay period, then the message, “Don't look away from the fixation point until it's time to respond,” appeared for 2 s and the trial restarted with newly randomized stimulus orientations. Each testing session took ∼1 h. After 50% of trials were completed, the observers were requested to take a short break, but were also instructed that they could rest at other times as they required.
Experimental design and analyses.
All comparisons in this experiment were within-subjects. Only trials in which the target item was positioned 10° to the right of fixation were analyzed. For items at this location, we compared memory performance across radial and tangential conditions with two measures collapsed across temporal order. We first analyzed the variability of report errors by calculating the circular SD of reports for each condition for each observer. These values were compared across conditions with a Bayesian t test using JASP software. We used the default Cauchy prior width of 0.707, but all results reported below were robust to standard alternate prior widths. Alongside Bayes factors, we provide Student's t test results.
In a second analysis, we assessed whether there was an influence of intracortical spacing on observers' reports using a probabilistic model of working memory performance. This was the “swap” model introduced by Bays et al. (2009) in which observers' responses are attributed to a mix of noisy reports centered on the target orientation, noisy reports centered on nontarget items, and a uniform lapse rate (see also Zhang and Luck, 2008). The details of this model have been described extensively previously (Bays et al., 2009; Gorgoraptis et al., 2011). The model has three free parameters: precision of reports, proportion of swap errors, and proportion of guesses. Parameters were estimated by maximum likelihood using code available online (http://www.paulbays.com/code/JV10; Bays et al., 2009).
We compared two versions of the model: a full model in which a separate set of parameters was used for radial and tangential conditions and a restricted model in which a single set of parameters was used for both conditions. To compare which of the two models best described the data, we used the Akaike Information Criterion (AIC) summed across participants. To further test whether the models accounted for the data differentially, we submitted the differences in individuals' AIC scores to Bayesian and Student's t tests.
Experiment 2 overview
Experiment 2 was designed to ensure that the physical spacing between memoranda would result in competing representations within V1. For each participant in Experiment 2, we first measured the critical spacing of crowding, which is the area within which crowding occurs (Pelli and Tillman, 2008). We then tested observers' memory for memoranda presented sequentially within versus beyond their critical spacing. Moreover, we tested whether there was a correlation between critical spacing and memory performance, which could arise if working memory is related to individual differences in cortical surface area (Schwarzkopf et al., 2011). To increase statistical power and to assess the correlation between critical spacing and memory performance, we greatly increased the sample size compared with Experiment 1.
Each participant first completed a crowding task in which we found the inter-item distance at which their ability to recognize a target reached threshold level, which we take as the critical spacing of crowding. A participant's basic task was to identify the orientation of a bar surrounded by a circle flanked on either side by distractors (see Fig. 3A). Target and distractors were briefly presented in the upper peripheral visual field and trial-by-trial variations in inter-item spacing were controlled by an adaptive procedure. The participant reported the target orientation by clicking on one of three response options shown around the point of fixation (a three-alternative forced-choice task). After finding their critical spacing, the participant then completed a memory experiment in which three randomly oriented bars were presented in sequence in one of two spatial configurations (see Fig. 4A). Within each trial, memoranda were presented across a spatial range equal to either 0.75 or 1.5 times their critical spacing, corresponding to “crowded” and “uncrowded” conditions, respectively. As in Experiment 1, there was a common screen position for one item in each condition and we analyzed only memory performance for this stimulus position. Therefore, any differences in performance across conditions could only be driven by differences in intracortical spacing of memoranda.
Participants.
Twenty-one participants took part in Experiment 2 (mean age 30.14 ± 8.69 years; 8 male, 13 female), one had also participated in Experiment 1 and all other details were as per the previous experiment. Two participants did not complete the experiment due to problems tracking their eyes and their data were excluded from analyses, leaving a final sample size of 19.
Experimental setup.
All details were as per Experiment 1.
Stimuli.
Stimuli were bars (0.85° × 0.04°) centered in a circle with a diameter matching the bar length and a width of 0.04° (see Figs. 3A, 4A). Three of such stimuli were displayed in each trial of both the crowding experiment and the memory experiment and were uniquely colored. We chose three colors equally spaced in CIE L*a*b* color space, approximating red (L* = 74, a* = 34.6, b* = 20), green (L* = 74, a* = −28.3, b* = 28.3), and blue (L* = 74, a* = −28.3, b* = −28.3) hues. Colors were assigned randomly to the three stimuli on each trial. A white fixation spot was displayed in the center of the screen throughout stimulus presentation and the memory delay period. All stimuli were presented on a black background.
In the crowding task, three oriented stimuli were presented simultaneously on each trial (see Fig. 3A). The target orientation was random and the distractors' orientations were selected randomly from a uniform distribution that excluded orientations within 22.5° of the target orientation. Stimuli were centered 8.5° above fixation and arranged tangentially relative to fixation. The center stimulus was the target and the others were distractors. As described below, the target–distractor distance was controlled via a staircase. Response stimuli were target and distractors in a neutral hue (gray) appearing in random positions but equally spaced on the border of an imaginary circle (radius = 1.7°) around the screen center (see Fig. 3A). When response stimuli were on screen, observers could move a standard mouse arrow that appeared in the screen center. In the memory experiment, memoranda were of the same dimensions as the target and distractors in the crowding experiment, were each randomly assigned the colors described above, but were presented sequentially in random order. Stimulus orientations in the memory experiment were randomized with no restrictions.
Procedure.
A typical trial sequence of the crowding task is shown in Figure 3A. Each trial began after fixation compliance as per Experiment 1. Target and distractors appeared for 500 ms. After a 500 ms delay, response stimuli and the response arrow appeared centered at fixation and observers moved the arrow with the mouse and clicked on which stimulus they thought matched the target orientation. Observers were instructed that the target was always the central item on every trial and that one response item matched its orientation exactly. No other instruction was given explicitly regarding the distractor response items, but if a participant asked about them, then the experimenter told them that one item matched the target and the other two response items matched the distractors. The next trial immediately followed each mouse click that fell within the border of a stimulus and that stimulus was taken as their response.
The distance between the target and each distractor was controlled on each trial via an adaptive procedure, QUEST (Watson and Pelli, 1983), which was set to find the target–distractor spacing at which performance reached 67% accuracy (the midpoint of the psychometric function for a 3AFC task). We ran two randomly interleaved staircases of 36 trials each. For each QUEST procedure, we set the initial midpoint of the psychometric function (μ, see below) to two different levels to probe the asymptotes of the fitted function. These values, based on pilot observations, were set to 3.4° and 1.7°. These different QUEST parameters have the added advantage that the participant initially experiences relatively difficult and easy trials early on during testing. Furthermore, we allowed the target–distractor distance to vary only in steps of 0.21° during this threshold task. The procedure took ∼7 min. Whereas there was inevitably a working memory component to the crowding task, only the central element needed to be held in memory, so performance in this task indexes crowding occurring in sensory processing due to the simultaneously presented flankers rather than in memory.
The memory experiment was conducted in the same session as the crowding task and is shown in Figure 4A. Fixation compliance was performed as above and then each memory item was shown in random order with a duration, interstimulus interval, and delay period of 500 ms. Memoranda were shown in one of two spatial configurations either spaced to fall within or beyond the critical spacing of crowding, as measured during the preceding task. After the delay period, a circle (diameter = 0.85°, width = 0.04°) matching the color and location of one memory item was displayed, prompting the observer to report that item's orientation using the mouse. After the first mouse movement was registered, a response bar appeared within the circle so that the entire response stimulus matched the target dimensions. Observers then reported the target orientation as per Experiment 1 and the next trial began. Fixation errors and breaks were dealt with as described for Experiment 1. The crowding task and memory experiment took between 1 and 1.5 h per observer. The number of trials per stimulus combination was as described in Experiment 1.
Experimental design and statistical analyses.
We pooled data across staircases in the crowded task and used the least-squares method to fit the Weibull function specified by Watson and Pelli (1983) (see Fig. 3B). We modified the function to have three free parameters, μ, σ, and g, corresponding to the midpoint of the psychometric function, the slope, and the lapse rate, respectively. We took an observer's critical spacing to be μ, which was bound between 0.85° and 8.5°, the lower of which ensured incomplete overlap of stimulus positions in the memory experiment for participants with very small crowding zones. Note that the lower bound was reached by only 2 of 19 participants and none reached the upper bound (see Fig. 3C), so this restriction is unlikely to have affected the results. The slope, σ, was bound between 0 and infinity and the lapse rate, g, was bound between 0 and 0.05 as recommended by Watson and Pelli (1983) (see also Wichmann and Hill, 2001).
All comparisons in the memory experiment were within-subjects. We performed the same analyses of report variability and model fitting as per Experiment 1, but now with the conditions “crowded” and “uncrowded” to indicate trials in which memoranda were presented within or beyond the critical spacing of crowding, respectively. Importantly, these analyses were restricted to only memory items presented at the same screen position in both conditions so that performance was matched in all aspects except for the spatial arrangement of memoranda. We further tested for a relationship between cortical spacing and short-term memory with correlational analyses. We performed both a Bayesian Pearson correlation and linear regression using JASP to determine whether memory performance, regardless of crowding level, could be predicted by critical spacing. We again restricted data to only trials in which the memory item was presented directly above fixation. For the Bayesian correlation, we used the default stretched β prior width of 1, but results of this analysis were robust to various prior widths.
Results
Experiment 1
Perceptual resolution in peripheral vision is constrained by the distance between objects in V1. As visual eccentricity increases, fewer visual neurons are available to process a constantly sized input and this relationship is approximately logarithmic (Duncan and Boynton, 2003; Pelli, 2008). This log-scaling of visual cortex causes greater perceptual interference when multiple items are presented along a radial axis from the fovea compared with a tangential axis (Toet and Levi, 1992; Pelli and Tillman, 2008). In Experiment 1, we tested whether working memory is similarly influenced by the cortical spacing between memoranda (Fig. 1A). Observers were required to remember three sequentially presented, randomly oriented bars arranged either radially or tangentially relative to the point of fixation (Fig. 1B). At the end of each sequence, observers' memory of orientation was tested for a single item indicated by a location and color probe (Fig. 1C) and responses were made by manually adjusting a response bar to match the cued item. To control for non-memory-related differences across conditions, such as visual acuity, we analyzed memory performance only for targets positioned at 10° to the right of fixation in each condition. These stimuli were matched in all aspects except their spatial context.
Figure 2 summarizes observers' report errors for memoranda presented within a radial or tangential spatial configuration. As shown in Figure 2A, the circular SD did not differ consistently between configurations. Indeed, a Bayesian paired-samples t test found weak to moderate evidence in favor of there being no difference between conditions (B01 = 2.97; t(9) = 0.45, p = 0.66). These data provide evidence against the hypothesis that short-term memory is worse when memoranda are more closely spaced in visual cortex.
Figure 2C shows the distribution of errors in each condition. The solid line shows the fit of a model in which we assumed that memory performance factors are independent of the arrangement of stimuli. This model was a better fit to the data than the model in which cortical spacing could influence memory performance [summed ΔAIC = 29.5; 8 of 10 participants; Bayesian paired-samples t test: B10 = 3.60; t(9) = 2.83, p = 0.02; maximum likelihood (ML) parameter values, mean (SE): precision = 5.21 (0.25); swaps = 0.10 (0.01); guesses = 0.14 (0.02)]. This analysis further supports a dissociation between intracortical spacing and memory performance.
Finally, we ruled out the possibility that, although memory for the central item was unaffected, cortical spacing may have influenced the flanking memoranda that were excluded from the preceding analyses. We therefore repeated the above analyses, but included only trials in which the probed item was not in the central position. We first collated data across the remaining probe locations for each condition. We again found that there was no difference in circular SD between radial and tangential conditions (B01 = 2.89; t(9) = 0.52, p = 0.62). The model in which we assume working memory is independent of cortical spacing was also the superior model (summed ΔAIC = 34.4; 9 of 10 participants).
Experiment 2
In Experiment 1, we manipulated intracortical spacing of memoranda by presenting items along a radial or tangential visual axis relative to fixation. We found positive evidence that performance was the same across conditions (Fig. 2A). These results suggest that visual short-term memory does not have the properties of visual crowding that characterize retinotopic sensory areas that encode features. It is possible, however, that the stimulus arrangements that we selected were not appropriately scaled to produce overlapping cortical mnemonic representations. To address this possibility, we conducted a second experiment in which we used a psychophysical approach to tailor intracortical spacing of memoranda individually for each participant.
We tested whether the cortical spacing of memoranda affects short-term memory by presenting items sequentially either within or beyond the critical spacing of crowding. Critical spacing was found for each participant in a perceptual crowding task in which we used an adaptive staircase to find the target–distractor distance at which they could identify a target orientation at threshold level (Fig. 3A). Results from two example participants who performed differently at this task are shown in Figure 3B. Figure 3C shows the critical spacing estimates for all observers and the median for the group. Critical spacing estimates span a nearly fourfold range and such between-subjects variability has been reported previously (Petrov and Meleshkevich, 2011; Greenwood, Szinte, Sayim, and Cavanagh, 2017). To control for between-subjects crowding variability in the memory experiment, and therefore to control for cortical spacing variability across participants, we adjusted the spatial range of memoranda in the memory experiment to be either 0.75 times (“crowded”) or 1.5 times (“uncrowded”) an observer's critical spacing.
Results from the memory experiment are shown in Figure 4, B–E. We first compared observers' report variability for the crowded and uncrowded conditions (Fig. 4B). These data are summarized as difference scores in Figure 4C. Rather than finding an effect of crowding on response SD, a Bayesian paired-samples t test found moderate evidence in favor of there being no difference between conditions (BF01 = 4.21; t(18) = 0.051, p = 0.96).
Figure 4D shows the distribution of report errors averaged across observers, with green and purple data showing crowded and uncrowded conditions, respectively. We tested whether memory performance across conditions is better described by a model in which cortical spacing influences performance or a model in which working memory is independent of cortical spacing of memoranda. The blue line in Figure 4D shows the model that is independent of cortical spacing, which was a better fit than the alternate model [summed ΔAIC = 52.46; 16 of 19 participants; Bayesian paired-samples t test: B10 = 150.2; t(18) = 2.83, p < 0.001; ML parameter values, mean (SE): precision = 5.62 (0.16); swaps = 0.04 (0.002); guesses = 0.34 (0.01)]. Although there is a higher probability density of uncrowded trials than crowded trials in the central bin (Fig. 4D; BF10 = 6.61), 16 bins were arbitrarily chosen for display purposes and there would have been evidence against such a difference between conditions had we selected, for example, 15 bins (BF10 = 0.43). The analysis of variability and model fitting above are based on raw (unbinned) data, so they are not influenced by arbitrary designation of bin size.
Figure 4E shows the results of the correlational analysis in which we investigated whether there was a relationship between observers' critical spacing and memory performance. A Bayesian correlation pairs test found moderate evidence that there is no relationship (r = 0.015, BF01 = 3.52). Similarly, a linear regression that uses critical spacing to predict report error found a slope of only 0.007 (t = 0.062, p = 0.951), indicating that there is no relationship between critical spacing and working memory performance.
As with Experiment 1, we again ruled out the possibility that cortical spacing may have influenced the flanking memoranda that were excluded from the preceding analyses. We repeated the above analyses including only trials in which the probed item was not in the central position, collapsing data across the remaining probe locations for each condition. In support of the results above, we found that there was no difference in circular SD between crowded and uncrowded conditions (B01 = 4.08; t(18) = 0.27, p = 0.79). Finally, the model in which we assume working memory is independent of cortical spacing was superior (summed ΔAIC = 51.86; 16 of 19 participants).
Discussion
We investigated whether the cortical spacing between sequentially presented memoranda affects observers' ability to hold those items in memory. In Experiment 1, we manipulated intracortical spacing by arranging memoranda either radially or tangentially relative to the fovea (Fig. 1). In Experiment 2, we tailored the intracortical spacing of memoranda to each observer by first quantifying their critical spacing of crowding (Fig. 3) and then presented memory items within or beyond this region (Fig. 4). Across both experiments, we found positive evidence that working memory performance is independent of the cortical distance between memoranda. Although the strength of evidence in each experiment was only moderate, the combined evidence across experiments is assessed by the product of the individual Bayes factors: that is, 12.5, which is substantial.
Our study provides clear evidence of a dissociation between perceptual coding and memory coding within a very short period after stimulus offset. Cortical distance in retinotopically organized visual cortex can account for a wide variety of perceptual phenomena such as visual acuity (Duncan and Boynton, 2003), shape perception (Michel et al., 2013), subjective experience of size (Schwarzkopf et al., 2011), and visual crowding (Pelli, 2008). In the present study, however, we have shown that memory representations of nonspatial features are independent of their V1 sensory representations. We know from our data that the emergence of dissociated representations occurs within the timeframe of the target duration and interstimulus interval (1 s). This time course thus places an upper bound on the transfer of retinotopic sensory representations to other neural systems involved in working memory.
This result sheds light on previous psychophysical studies that have found errors in working memory due to spatially proximal memoranda. Pertzov et al. (2014) and Ahmad et al. (2017) found that memory for nonspatial features was worse when memoranda were presented sequentially at overlapping or similar screen locations than when memoranda were presented at spatially separate screen locations. However, the timing used in these experiments would have likely produced perceptual interference sometimes referred to as “temporal crowding” (Yeshurun et al., 2015). Such perceptual interference would degrade the encoding of memoranda due to their persistent overlapping cortical representations. Indeed, the nature of errors in these previous studies of working memory are consistent with those in visual crowding paradigms with minimal working memory demands (Ester et al., 2014; Harrison and Bex, 2015, 2017). The combination of target duration and interstimulus interval used by Pertzov et al. (2014) (500 ms) thus sets a lower bound on the time required to transform a sensory signal into a memory representation.
Our results raise several important challenges for the hypothesis that working memory representations are maintained via the same sensory neurons that encode the features of memoranda (Serences, 2016). Previous studies in which a remembered feature is decoded from activity within V1 typically analyze activity within voxels corresponding to the spatial location of the memory item (Serences et al., 2009; Harrison and Tong, 2009). Because our data reveal that sensory representations are independent of memory representations, these decoding analyses must either be decoding nonsensory neurons that are nonetheless tuned to the memoranda feature dimension, which we think is unlikely, or reflect an influence from other areas. Other brain regions implicated in memory maintenance include prefrontal cortex and posterior parietal cortex (Courtne et al., 1998; Todd and Marois, 2004; Christophel et al., 2012; Bettencourt and Xu, 2016). In prefrontal cortex in particular, neurons display activity during memory delays that encodes stimulus locations and features (Goldman-Rakic, 1995; Wimmer et al., 2014; Mendoza-Halliday and Martinez-Trujillo, 2017; Murray et al., 2017; but see Lara and Wallis, 2014). These areas are part of a distributed network involved in working memory and the role of V1 in this network remains to be fully understood (D'Esposito, 2007; D'Esposito and Postle, 2015; Christophel et al., 2017).
Another alternative is that working memory is maintained via the recruitment of sensory neurons well beyond the initial sensory representation (Ester et al., 2009). According to this “neural outsourcing” proposal, the memory representation of a stimulus might be shifted to neurons that normally encode sensory stimulation in some other part of the visual field. However, it has yet to be clarified how visual features with overlapping sensory representations are allocated to other sensory regions in a way that prevents memory interference and how a mapping is maintained between outsourced representations and their original locations in the visual field.
Bays (2014) recently proposed a neural resource model of working memory based on population coding, which can account for changes in memory precision as a function of the number of memoranda. A key feature of this model is that a fixed amount of neural activity (i.e., spiking) must be shared among all memory items. Increasing set size, therefore, decreases the neural resource available for each item, resulting in a loss of memory precision. The property of maintaining a fixed level of population activity is termed normalization: it has been described as a canonical neural computation, which is implemented in many different neural systems using varied mechanisms (Carandini and Heeger, 2011). To reproduce observed effects of set size accurately, the normalization in the model must operate globally; that is, not be limited to particular regions of the visual field or particular stimulus feature values (Bays, 2015). The present results are in agreement with this in that they confirm that there is no cost of spatial proximity of memoranda as might be expected from a purely local form of normalization.
Neurophysiological evidence consistent with global normalization has been found in prefrontal and posterior parietal cortices, areas that have been implicated as playing an important role in working memory maintenance (for review, see Bays, 2015). Although inspired by properties of visual neurons, the neural resource model is agnostic as to the neural locus of working memory representations because population coding is a common mechanism of representation observed throughout the brain (Pouget, Dayan, and Zemel, 2000), including prefrontal cortex (Wimmer et al., 2014; Murray et al., 2017). Nonetheless, one possible interpretation consistent with the present findings is that, in the case of visual working memory representations, normalization occurs within networks in which neurons are not strictly topographically organized.
Although neural models of short-term memory can account for a broad range of human performance, we are not aware of any model that can account for our result. In a recent study, Schneegans and Bays (2017) presented strong evidence in favor of a model in which nonspatial features are combined with spatial location via a conjunctive population code. This extension of the neural resource model correctly predicted their empirical observation that, when memoranda are presented simultaneously, observers were more likely to confuse items in working memory (“swap” errors) when the cued memory item was close to distractors than when distractors were relatively distant from the cued item.
This model is also consistent with the results of Tamber-Rosenau et al. (2015), who found that the frequency of swap errors for simultaneously presented memoranda depends on the degree of perceptual crowding. Because visual crowding increases positional uncertainty (Harrison and Bex, 2017), a conjunctive code that binds spatial location with orientation will produce more swap errors under strongly crowded conditions than weakly crowded conditions, as was observed by Tamber-Rosenau et al. (2015). The Schneegans and Bays' (2017) model therefore suggests an important role of location in binding nonspatial features when items are presented simultaneously, but leaves open the question of how to account for the present findings with sequentially presented memoranda. It is possible that nonspatial features can be bound according to a conjunctive code that links features with their temporal order, but neurophysiological evidence for such a model is scarce. Accounting for the lack of spatial interactions between sequentially presented memoranda represents a challenge for future modeling efforts.
Footnotes
This work was supported by the Wellcome Trust and the National Health and Medical Research Council of Australia (C.J. Martin Fellowship APP1091257 to W.J.H.).
The authors declare no competing financial interests.
- Correspondence should be addressed to William J Harrison, Department of Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK. willjharri{at}gmail.com
This is an open-access article distributed under the terms of the Creative Commons Attribution License Creative Commons Attribution 4.0 International, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.