Abstract
The dorsolateral prefrontal cortex (dlPFC) is composed of multiple anatomically defined regions involved in higher-order cognitive processes, including working memory and selective attention. It is organized in an anterior–posterior global gradient where posterior regions track changes in the environment, whereas anterior regions support abstract neural representations. However, it remains unknown if such a global gradient results from a smooth gradient that spans regions or an emergent property arising from functionally distinct regions, that is, an areal gradient. Here, we recorded single neurons in the dlPFC of nonhuman primates trained to perform a memory-guided saccade task with an interfering distractor and analyzed their physiological properties along the anterior–posterior axis. We found that these physiological properties were best described by an areal gradient. Further, population analyses revealed that there is a distributed representation of spatial information across the dlPFC. Our results validate the functional boundaries between anatomically defined dlPFC regions and highlight the distributed nature of computations underlying working memory across the dlPFC.
SIGNIFICANCE STATEMENT Activity of frontal lobe regions is known to possess an anterior–posterior functional gradient. However, it is not known whether this gradient is the result of individual brain regions organized in a gradient (like a staircase), or a smooth gradient that spans regions (like a slide). Analysis of physiological properties of individual neurons in the primate frontal regions suggest that individual regions are organized as a gradient, rather than a smooth gradient. At the population level, working memory was more prominent in posterior regions, although it was also present in anterior regions. This is consistent with the functional segregation of brain regions that is also observed in other systems (i.e., the visual system).
Introduction
The concept that the cerebral cortex is parcellated into distinct interconnected brain regions is a cornerstone of our understanding of brain function. This idea was initially inspired by the observed segregation of cytoarchitectonic and functional properties across the cortical surface (Brodmann, 1909; Penfield and Jasper, 1954). Since then it has been further supported by multiple converging lines of evidence, including differences in anatomic and functional connectivity between regions (Yeo et al., 2011; Yeterian et al., 2012; Glasser et al., 2016). These anatomic and functional differences are two sides of the same coin; differences in anatomy such as inputs to the region, outputs to other regions, within-region connectivity, and intrinsic neuronal properties mold the functional properties of a specific brain region (Zylberberg and Strowbridge, 2017).
The dorsolateral prefrontal cortex (dlPFC; Fig. 1A) has been parcellated into separate brain regions based on anatomic and functional properties (Kaping et al., 2011; Yeterian et al., 2012). It is involved in cognitive functions such as working memory, selective attention, and motor planning (Miller and Cohen, 2001). Substantial evidence supports the notion of a global functional gradient along the anterior–posterior (AP) axis of the dlPFC (Koechlin et al., 2003; Petrides, 2005; Badre and D'Esposito, 2007, 2009; Riley et al., 2017, 2018). This functional gradient appears to reflect a functional hierarchy, with anterior regions supporting more abstract neural representations and complex action rules and posterior regions tracking moment-to-moment changes in the environment and the organism (Badre and D'Esposito, 2009; Constantinidis and Qi, 2018).
Such global functional gradients in the dlPFC could be the result of a smooth gradient of functional properties along the anterior–posterior axis (Fig. 1B). If this were true, then the segregation of dlPFC into different regions would be misleading, as a smooth gradient would instead suggest that the dlPFC acts as a single functional region, despite its heterogeneous anatomy. Alternatively, these global functional gradients could be the result of an emergent property from the organized arrangement of functionally distinct areas separated by functional boundaries, or an areal gradient (Fig. 1C). If this were the case, then the segregation of dlPFC into different functional regions would be justified.
On one hand, a smooth gradient could be expected based on the observation of what appears to be a smooth functional gradient along its anterior–posterior axis (Riley et al., 2017, 2018; Xu et al., 2022), as well as the high mixture of selectivities observed across dlPFC areas, which lack a clear topography (Rao et al., 1997; Mante et al., 2013; Rigotti et al., 2013). On the other hand, an areal gradient could be expected based on the observation of marked functional differences between dlPFC areas (Kaping et al., 2011), as well as the known anatomic differences between regions (Kaping et al., 2011; Petrides et al., 2012; Yeterian et al., 2012; Markov et al., 2014; Barbas, 2015). Anatomically, the anterior–posterior axis of the dlPFC can be subdivided into three distinct areas—8A, 9/46, and 46, from posterior to anterior—defined by specific cytoarchitectonic and connectivity patterns (Petrides et al., 2012; Yeterian et al., 2012; Markov et al., 2014). For example, area 9/46 receives input from the insula, whereas the adjacent area 46 does not (Yeterian et al., 2012). Furthermore, the principal sulcus appears to divide the regions into distinct dorsal and ventral portions with different connectivity patterns. For instance, the posterior cingulate cortex projects to the dorsal portion of these regions, whereas it does not project to their ventral portions (Yeterian et al., 2012). Such input differences may imply functional differences between these anatomically defined regions and support an areal gradient across the dlPFC. Crucially, determining whether the anterior–posterior functional global gradient in dlPFC is the result of a smooth gradient or an areal gradient will help constrain models of the prefrontal cortex, guide electrode positioning in future studies, as well as contextualize the conclusions of prior studies.
To assess whether the functional gradient across the dlPFC is the result of a smooth or an areal gradient, we recorded the activity of neurons in the dlPFC of two monkeys while they performed a working memory task that required ignoring a distractor shown midway through the delay period (Fig. 2A), that is, between delays 1 and 2. We assessed the task-relevant response properties of single neurons and the information contained in the neural populations they form as a function of their position along the anterior–posterior axis in individual monkeys (Fig. 2). We applied principal component analysis (PCA) on task-relevant response properties (e.g., proportion of selective cells during the stimulus or delay period, see below, Materials and Methods), and assessed whether the different PCs were better explained by a one-segment fit (which would support a smooth gradient), or a two-segment fit (which would support an areal gradient). We found there was an anterior–posterior global gradient in both animals (Fig. 3A). Crucially, we found that these global gradients were better explained by areal gradients compared with smooth gradients. These areal gradients across PCs capture a trend across response properties as the PC loadings were distributed across them. To determine whether these areal gradients were reflected at the population level we performed a population decoding analysis of task-relevant information (e.g., visual stimulus information and delay period memory information) along the anterior–posterior axis. We asked whether decoding accuracy differences within each region were smaller than between regions. We found that the presence of global gradients was recapitulated at the population level; however, the distinction between areal and smooth gradients were not observed at this coarser scale. Together, our results support the notion that global gradients are better described by areal gradients of functional properties. Additionally, regions of dlPFC maintained task-related information in a distributed manner during the delay period.
Materials and Methods
Animals and single-unit recordings
We used two male adult macaques (Macaca fascicularis) in the experiments. All animal procedures were approved by and conducted in compliance with the standards of the Agri-Food and Veterinary Authority of Singapore and the Singapore Health Services Institutional Animal Care and Use Committee (SingHealth IACUC #2012/SHS/757). Procedures also conformed to the recommendations described in Guidelines for the Care and Use of Mammals in Neuroscience and Behavioral Research (Van Sluyters and Obernier, 2003).
We first implanted a titanium head post (Crist Instrument), followed by intracortical microelectrode arrays (Microprobes) in the left frontal cortex. In Monkey 1, we implanted 6 arrays of 16 electrodes, 1 array of 32 electrodes, and 2 arrays of 32 electrodes for a total of 192 electrodes. In Monkey 2, we implanted 1 array of 16 electrodes, 2 arrays of 32 electrodes, and 2 arrays of 16 electrodes in the cortex for a total of 112 electrodes. In Monkey 3, we implanted four arrays of 32 electrodes for a total of 128 electrodes. Parthasarathy et al. (2017) provides additional details on the surgery.
Electrophysiology
Neural signals were acquired using a 128-channel and a 256-channel Plexon OmniPlex system at a sampling rate of 40 kHz. Wideband signals were bandpass filtered between 250 and 10 000 Hz. Subsequently, spikes were detected using an automated Hidden Markov model–based algorithm for each channel (Herbst et al., 2008). Eye positions were obtained with an infrared-based eye-tracking device from SR Research (EyeLink 1000 Plus). The behavioral task was designed on a stand-alone personal computer (stimulus computer) using the Psychophysics Toolbox in MATLAB (MathWorks; Brainard, 1997). To align the neural and behavioral activity (trial epochs and eye data) for downstream data analysis, we used strobe words denoting trial epochs and performance (rewarded or failure) during the trial. These strobe words were generated on the stimulus computer and sent to the Plexon and EyeLink computers via the parallel port.
Delayed-saccade working memory task with intervening distractor
Each trial started with a mandatory period (500 ms) during which the animal fixated on a white circle at the center of the screen. While continuing to fixate, the animal was presented with a target (a red square) for 300 ms at any one of eight locations in a 3 × 3 grid (four locations in the corners of a 3 × 3 grid for the third animal). The center square of the 3 × 3 grid contained the fixation spot and was not used. The presentation of the target was followed by a delay of 1000 ms, during which the animal was expected to maintain fixation on the white circle at the center. At the end of this delay, a distractor (a green square) was presented for 300 ms at any one of the seven locations (other than where the target was presented). This was again followed by a delay of 1,000 ms. At the end of the second delay, the animal was given a go cue (the disappearance of the fixation spot) to make a saccade toward the target location presented earlier in the trial. Saccades to the target location within 150 ms and continued fixation at the saccade location for 200 ms was considered a correct trial. An illustration of the task is shown in Figure 2A. Because of a behavioral bias in one of the animals, the target was presented in only seven of the eight target locations in that animal (we excluded the location at the bottom right).
Estimation of anatomic parcellations for each monkey
We estimated the anatomic boundaries of each region from a magnetic resonance imaging atlas of the cynomolgus monkey brain in a native millimeter space (Frey et al., 2011). The anatomic boundaries identified from the atlas had the following values: 8Ad-9/46d (2.8 mm), 9/46d-46d (9.5 mm). Neurons anterior to 9/46d were not used as they were only recorded in one animal.
Estimation of electrode positions
We calculated the anterior–posterior location of each electrode using photographs taken during the implantation surgeries. We used the electrode arrays, which have a fixed size, to convert pixels to millimeters. The posterior tip of the principal sulcus was used as the reference so that positive values denote more anterior positions, and negative values denote more posterior positions (Fig. 2B,D).
Calculating functional measures per electrode
The measures used to derive the principal components are as follows: target selectivity F statistic, distractor selectivity F statistic, response latency, receptive field size, stimulus selectivity index, distractor filtering, delay 1 selectivity F statistic, delay 2 selectivity F statistic, nonlinear mixed-selective F statistic, delay 1 memory field size, delay 2 memory field size, delay 1 selectivity index, delay 2 selectivity index.
To determine whether a neuron was selective during the target, distractor, and/or delay periods, we averaged the spike count across the entire period (target period, 0–0.3 s; delay 1, 0.3–1.3 s; distractor. 1.3–1.6 s; delay 2, 1.6–2.6 s) and ran a one-way ANOVA across locations, with the F statistic test value used as a measure of selectivity. To calculate response latencies, we identified the earliest responsive bin for the preferred location for each neuron across the entire trial in 100 ms bins with 50 ms overlap (significantly above the prestimulus baseline of −300 ms to 0 ms; t test, p < 0.05). If there were no preferred locations, we identified the earliest responsive bin across all locations. To calculate receptive field size and memory field sizes, we counted the number of target locations where activity during the period analyzed (target, delay 1 or delay 2) was significantly different from the prestimulus baseline (−300 to 0 ms). To calculate selectivity indices during the target presentation or delay periods, we calculated the index as (max − min)/(max + min), where max is the average firing rate for the target location with the highest firing rate (maximum) and min is the corresponding location with the lowest firing rate (minimum). To calculate distractor filtering, we obtained the ratio between the average firing rate for the preferred location during the target presentation period to the firing rate for the same distractor location during the distractor presentation period. To determine whether a neuron had nonlinear mixed selectivity we measured the average activity of the last 500 ms of delay 1 and delay 2 and conducted a two-way ANOVA, using target location and epoch (delay 1 or delay 2) as the two factors.
These were then z-scored and used to derive the principal components that had a cumulative explained variance of at least 75%.
Fitting the linear (one segment) and piecewise (two segment) models
For each of the 13 functional measures assessed, single-neuron data were first smoothed along the AP axis using a 0.5 mm window, with 0.17 mm overlap (i.e., one-third of window size). To describe the trend along the anterior–posterior axis, we first fitted a one-segment and two-segment robust linear regression model to the smoothed dataset to reduce the bias from outlier data points (MATLAB function fitlm(), with the Robust parameter set to on and with the bisquare weighting function). To quantify the goodness of fit of a given functional measure, we calculated an adjusted R2 for the linear fit of the actual dataset, which penalizes the number of parameters used in the model, and thus penalizes the increasing number of segments fitted per model as follows:
For the two-segment models, we first defined the possible break locations, which were ±1.5 mm from the estimated anatomic boundaries (between areas 8Ad and 9/46d). This ±1.5 mm area was subdivided into 30 nonoverlapping 0.1 mm steps, and we fitted a line to the left and right of these breaks using robust linear regression. Of the 30 models, the model with the lowest root mean squared error across the PC fits was considered to be optimal, and thus its break location was used as the functional boundary for that functional measure.
Testing for the presence of global gradients
To assess whether there was a global gradient along the anterior–posterior axis, we tested whether the adjusted R2 of the one-segment fit was significant compared with a shuffled distribution of the adjusted R2. Specifically, we obtained a shuffled distribution by shuffling the dataset (after smoothing) along the AP axis and fitting the same robust linear model to these shuffled datasets for 1000 iterations. If the adjusted R2 of the actual fit was larger than the 95th percentile of the shuffled distribution, we considered it significant and determined that a global gradient was present (displayed in figures as an adjusted R2 in bold font above the one-segment models). We corrected for multiple comparisons with the false discovery rate Benjamini–Hochberg method, separately for the dlPFC of each monkey as they were analyzed for gradients independently.
Estimation of functional parcellations for each monkey
To identify the functional parcellation boundaries for each animal, we split electrodes into the 8Ad and 9/46d regions with the break point established above and further split each region into anterior and posterior parcels within 8Ad (8Ad-I and 8Ad-II) and 9/46d (9/46d-I and 9/46d-II) to approximately obtain equal numbers of neurons in each parcel. These neurons were then used in the population decoding analyses.
Cross-temporal decoding analysis
We pooled all neurons per region (using the functional boundaries) and subsampled to the minimum number of neurons across regions (n = 97, minimum in area 8Ad-II). We used data at each time point (100 ms bins with 50 ms overlap) to train a decoder using linear discriminant analysis (MATLAB function classify). We tested the decoder on data from all 100 ms time bins in the trial, as described in Parthasarathy et al. (2017). When decoding data in the full space, we denoised the training and testing data using PCA at every time point by reconstructing the data with the top n principal components that explained at least 90% of the variance. The target locations were predicted by training and testing the decoder on datasets from equivalent time points. The performance of the decoder, a measure of the information about the target location in the population activity, was computed at each time point as a percentage of test trials in which the target location was predicted correctly. This process was repeated 1000 times, with different subsets of correct trials used to constitute the training and test sets. We used 1500 trials for the training set and 100 trials for the test set.
Data availability
Data is available upon request. Further inquiries may be directed to the corresponding author.
Results
We recorded neurons from the dlPFC of two adult macaque monkeys that were trained to perform a delayed saccade task with an intervening distractor (Fig. 2A). The monkeys achieved a behavioral performance of ∼75% correct trials (74% for Monkey 1, 77% for Monkey 2, across four recording sessions per monkey). Our analyses focused on physiological properties during the target and distractor presentation periods (300 ms each; Fig. 2A, red and green periods, respectively) and the last 500 ms of delay 1 and delay 2 periods (Fig. 2A, yellow and orange periods, respectively). We chose to analyze the last 500 ms of the delay periods because we previously showed that the activity during the first 500 ms of the first delay had strong temporal dynamics, whereas the activity during the last 500 ms was more stable (Parthasarathy et al., 2017).
To examine gradients along the anterior–posterior axis, we first estimated the AP position of each microelectrode (and hence, neuron) from images taken during surgery (Fig. 2B). The AP position of each neuron was calculated with reference to the posterior tip of the principal sulcus, where zero denotes the posterior tip of the principal sulcus, negative values represent locations posterior to the tip, and positive values represent locations anterior to the tip. We estimated the anatomic boundaries of each region (areas 8A, 9/46, 46) from a macaque MRI atlas (Frey et al., 2011; Fig. 2C). Combining the AP positions of electrodes and the anatomic boundaries, we parcellated electrodes into subregions, separately for each monkey (Fig. 2D). Based on estimated anatomic parcellations, our electrodes spanned two regions of the dorsal dlPFC, 8Ad and 9/46d (Fig. 3B). We recorded a total of 427 neurons from the dlPFC of the two monkeys, comprising areas 8Ad and 9/46d (M1, 74 neurons; M2, 353 neurons; Table 1). We recorded neurons from the more anterior area 46d in only one animal (M2). These neurons are displayed in Figure 3 (columns IV and V), but were not included in any neuron counts or analyses here.
Functional organization of single neurons along the anterior–posterior axis
We sought to assess whether the functional properties of dlPFC neurons were organized as a global gradient along the anterior–posterior axis, and if so, whether this global gradient was better explained by a smooth or areal gradient. We first derived response properties of single neurons during the stimulus presentation period (proportion of target and distractor selective cells, response latency, receptive field size, stimulus selectivity index, and the degree of distractor filtering) and during each delay period (last 500 ms of a 1000 ms delay; the proportion of target selective and nonlinear mixed selective cells, working memory field size, and the degree of selectivity for each delay). A subset of these response properties were chosen to align with previous studies that have shown a global gradient (Riley et al., 2017, 2018), and we expanded on these by adding similar properties specific to our behavioral task (measures related to the distractor and target encoding during delay periods). To extract the relevant axes of variability and to avoid biasing the results with correlated physiological measures, we performed a PCA on these measures and extracted the PCs that explained at least 75% of variance in both animals (M1, ncomponents = 5, explained variance = 82.3%; M2, ncomponents = 5, explained variance = 78.3%). We then plotted the projections onto each PC on the y-axis, and their anterior–posterior anatomic position was plotted on the x-axis, regardless of their dorsoventral position within the region (Fig. 2E). The data were spatially smoothed using a 0.5 mm sliding window with a third of overlap between windows.
To infer the existence of a global gradient of physiological properties we fitted a one-segment model to the different PCs. The model fit was considered significantly different from chance if its R2 value was significantly larger than the R2 of the same data shuffled along the anterior–posterior axis.
To determine whether the global gradient was better explained by a smooth or an areal gradient, we fitted a two-segment robust linear model to the data (Ryan and Porth, 2007; Fig. 2E). For the two-segment models, each discontinuation or break point corresponded to an estimated functional boundary based on the anatomic parcellations extracted from a macaque MRI atlas (Frey et al., 2011; Fig. 2E, right). We allowed the functional boundaries to deviate ±1.5 mm from these estimated anatomic boundaries to allow for interanimal variations (Xu et al., 2018). This was done separately for each monkey. The goodness of fit of the two-segment model was measured using the adjusted R2, which contains a penalty term for the increased number of parameters used in the two-segment models (Ryan and Porth, 2007). A model fit was considered significantly different from chance if the adjusted R2 of the model was significantly better than the adjusted R2 of the same data shuffled along the anterior–posterior axis. We then compared the one- and two-segment models to determine whether either fit the data better than the other. To compare the models we applied the Akaike Information Criterion (AIC). An absolute difference in AIC (ΔAICc, AIC1-seg – AIC2-seg) value ≥4 indicates that one model is better than the other (Burnham and Anderson, 2004). A ΔAICc value <4 would not allow us to determine which model fits the data better. A significantly better one-segment fit compared with a two-segment fit was taken as evidence that the global gradient was better explained by a smooth, compared with an areal, gradient (and vice versa).
Both monkeys showed a significant one-segment fit in the first PC, providing evidence for the existence of a global gradient of physiological properties (Fig. 3A). Importantly, PC1 for both monkeys contains contributions from most of the physiological measures included, so the global gradient cannot be attributed to a small subset of physiological properties (Fig. 3A). Although both one- and two-segment model fits were significantly better than chance, using PC1 we were not able to determine whether the global gradient was better explained by a smooth or an areal gradient (ΔAICc <4).
Neither monkey showed a significant one-segment fit for the rest of the PCs (2–5). However, they both showed significant two-segment fits for a subset of the PCs (M1, PC2 and PC4; M2, PC3; Fig. 3B–E). In monkey 1, the model comparison analysis showed that the two-segment fits of PC2 and PC4 are better than the one-segment fit (ΔAICc > 4). This was not observed for PC3 of monkey 2. The physiological measures that contribute the most to PC2 are receptive field size during stimulus presentation and the selectivity indices for delays 1 and 2. Those that contribute the most to PC4 of monkey 1 and PC3 of monkey 2 (which, based on the contribution graphs, appear to correspond to an equivalent axis in both monkeys) are target and distractor selectivity measures (target selectivity F statistic, distractor selectivity F statistic, stimulus selectivity index, distractor filtering) and delay period selectivity measures (delay 1 selectivity F statistic, delay 2 selectivity F statistic). These observations suggest that metrics that contribute to the encoding and maintenance of target information contribute the most to the differences between areas. Overall, these results support the interpretation that the global gradient of functional properties is better explained by an areal gradient rather than a smooth gradient.
Population-level functional differences along the anterior–posterior axis
To assess whether the dlPFC regions exhibit areal differences at the population level, we looked at task-relevant information across the AP axis. Specifically, we asked whether inter-regional information differences (between 8Ad and 9/46d) were greater than intraregional differences (within 8Ad and within 9/46d). To do so, we first parcellated neurons to region 8Ad or 9/46d based on the estimated functional boundaries from the areal models above (Fig. 2F,G; see above, Materials and Methods). We then split each region into two subregions of approximately equal neurons, that is, anterior 8Ad-I and posterior 8Ad-II for M1, and the same for 9/46d. (Neuron counts per region are shown in Table 1.)
To assess information at the population level, we used linear discriminant analysis to generate cross-temporal decoding plots, where the y-axis shows the time bins used to train the decoders, and the x-axis shows the time bins used to test the decoders (Fig. 4A). For constructing the decoders, we used 100 ms time bins with 50 ms overlap. Here, we use the term “information” to refer to the classification performance of the decoders. In each plot, we decoded the target or distractor location (Fig. 4). Using these plots, we assessed the following aspects of the information contained in each region: (1) information quantity within a period, (2) code stability within a period, and (3) code stability across periods. For target and distractor presentation periods, we analyzed the first two measures. For delays 1 and 2, we analyzed all three measures.
To compare information quantity and stability across regions, we matched the number of locations decoded and the number of cells used in each region (nlocations = 4, nmin. neurons = 97, nneurons per region = 110, 97, 108, 112). As area 8Ad-II had the lowest number of neurons, we used all 97 neurons in that area and randomly subsampled 97 neurons for other regions per decoding iteration (n = 1000). As such, the observed variance is from subsampling trials for the decoding analysis.
Target information quantity and code stability
To assess the behavioral relevance of the activity of different regions, we compared the quantity of information decoded and the stability of the code. Behavioral relevance would be reflected as a significant difference between the decoding performance.
From the cross-temporal decoding plots for correct trials (Fig. 4A) we first calculated the average target information, which was the average decoding performance of the 300 ms of the target presentation period. We found significant target information compared with chance in both 8Ads, with 8Ad-I having significantly more information than all other areas (Fig. 4B). The information was absent in the anterior region 9/46d (I and II).
We then looked at delay period information (the average decoding performance of the last 500 ms of delay 1 and delay 2). We averaged across both delays as they were not significantly different from each other. (The 2.5th to 97.5th percentile of the distributions of delay 1 and delay 2 values overlapped.) We found significant information in the delay period in all parcels except the anterior-most 9/46d-II, which had significantly less information than all other parcels (Fig. 4C).
We then assessed the degree to which the neural code was stable or dynamic across the trial. To do so, we calculated the within-period code stability, defined as the ratio of the square sans diagonal to the diagonal of the 300 ms of the target presentation period (Fig. 4A) or the last 500 ms of both delay periods (Fig. 4A). We averaged across both delays as they were not significantly different from each other (95th percentile overlap test, p < 0.01). A stable code would lead to a ratio close to one, whereas a dynamic code would lead to a ratio close to zero (i.e., time-varying code). We found that the code was stable across both regions, during both the target presentation and delay periods (Fig. 4D,E).
Finally, we assessed code morphing, the phenomenon where there is a change in the coding scheme used across delay 1 and 2 after an intervening distractor was presented (Parthasarathy et al., 2017). Code morphing is present when the neuronal code in delay 1 can be used to predict the target location in delay 1 but cannot be used to predict the target location in delay 2, and vice versa. To calculate the across-delay code stability, we calculated the ratio between the performance of a decoder trained and tested in the last 500 ms of delay 1 and a decoder trained in delay 1 but tested in delay 2. We did the same for delay 2. A code that that does not morph and is perfectly stable across delays would lead to a delay 1 and delay 2 code-morphing value of 1, whereas a code that morphs would have a value larger than 1 for both delay 1 and delay 2. We combined the results for delays 1 and 2 as they were not significantly different. We found code morphing in both regions, such that a code used during a given delay was not generalizable to the other delay (Fig. 4F).
Overall, there is a trend of greater information in posterior regions and does not distinguish between and areal or smooth gradient at this coarser scale. There also appears to be code morphing across regions, implying that the code used to represent the target changes after an intervening distractor. This suggests that task-relevant target information is stable and distributed across the AP axis of the dorsal dlPFC (except 9/46d-II), and the code morphs similarly across regions.
Discussion
Here, we show that there exists a global gradient in the dlPFC and is the result of functionally distinct areas organized as an areal gradient rather than a smooth gradient along the anterior–posterior axis. We showed this by analyzing functional properties across the anterior–posterior axis at a fine scale of single-neuron properties and coarse-scale population decoding of task-relevant information. We found that at the single-neuron level, areal gradients best describe the major principal components derived from physiological properties of single neurons along the anterior–posterior axis (Fig. 3). We also showed that at the level of neuronal populations, global gradients were similarly observed, as stimulus and memory information decreased from posterior to anterior regions. However, at this coarser level, were unable to distinguish between smooth and areal gradients (Fig. 4).
Global functional gradients are the result of functionally distinct areas organized as a gradient
Previous studies have described the existence of global functional gradients in the dlPFC (Riley et al, 2017, 2018). In line with this, we found global functional gradients, including posterior to anterior decreases in selectivity and memory field sizes during stimulus and delay periods, distractor filtering, mixed selectivity, and increases in latencies. Although it is challenging to match our microelectrode positions to those reported in the relevant studies by Riley et al. (2017, 2018), we estimate that our recording electrodes in the dlPFC fall within what they refer to as the “dorsal dlPFC” (i.e., regions dorsal to the principal sulcus, as well as the ventral lip of the principal sulcus). With that consideration, the global gradients we report here are largely consistent with those reported previously (Riley et al., 2017, 2018).
Here, we extend these studies by showing that these global gradients are the result of functionally distinct areas organized as an areal gradient rather than a smooth gradient of functional properties. To our knowledge, ours is the first study to explicitly assess whether the functional heterogeneity observed at the cellular level in the dlPFC can be better explained by a smooth gradient of functional properties, or by functionally distinct areas organized as a gradient. Our results support the latter interpretation.
Distributed information across dlPFC regions during spatial working memory
The following lines of evidence suggest that the posterior dlPFC regions play a special role in the maintenance of spatial working memory information. We found that although all areas (except 9/46d-II) had significant target information, the posterior-most 8Ad had significantly greater target information than all anterior parcels during the stimulus presentation period (8Ad-II, 9/46-I, and 9/46-II; Fig. 4B). During the delay, all three posterior parcels had significantly more information than the anterior–most 9/46-II. Together, these results support the notion of a distributed representation of task-relevant information across the posterior parcels of dlPFC, which may increase the robustness of relevant information in the face of intervening distractors (Mejías and Wang, 2022).
Conclusion
We found that functionally dissociable areas in the dlPFC are organized along the anterior–posterior axis in a functional gradient. Such organization has been observed in other brain systems, such as the visual system, where individual regions (V1, V2, V4, IT), which are anatomically and functionally distinct from each other, are arranged along an anterior–posterior functional gradient, with more anterior regions having larger receptive fields and more complex response selectivities (Hubel and Wiesel, 1959; Lerner et al., 2001; Tsao et al., 2006; Freud et al., 2017; O'Rawe and Leung, 2020). The ability to record large numbers of neurons simultaneously across multiple brain regions in behaving animals provides an important resource for systems neuroscience research (Mitz et al., 2017; Dotson et al., 2018). Crucially, our results underscore the importance of respecting the functional boundaries between regions when analyzing these large datasets as it is common practice to group cells from multiple regions for analyses (Parthasarathy et al., 2017, 2019; Bartolo et al., 2020; Tang et al., 2020). These results will also be important for researchers using artificial neural networks to model cognitive processes because a single prefrontal module, usually modeled as a randomly connected recurrent neural network or as a bump-attractor network, may not reflect functional distinctions and complex interactions between different prefrontal areas (Mante et al., 2013; Wimmer et al., 2014; Parthasarathy et al., 2019; Yang et al., 2019; Tang et al., 2020).
Footnotes
This work was supported by the Ministry of Education Singapore Grants MOE-T2EP30121-0010 (C.L.) and MOE2017-T3-1-002 (C.L.). We thank Shih-Cheng Yen for comments on the manuscript.
The authors declare no competing financial interests.
- Correspondence should be addressed to Camilo Libedinsky at camilo{at}nus.edu.sg