Abstract
Active maintenance of rules, like other executive functions, is often thought to be the domain of a discrete executive system. An alternative view is that rule maintenance is a broadly distributed function relying on widespread cortical and subcortical circuits. Tentative evidence supporting this view comes from research showing some rule selectivity in the orbitofrontal cortex and dorsal striatum. We recorded in these regions and in the ventral striatum, which has not been associated previously with rule representation, as macaques performed a Wisconsin Card Sorting Task. We found robust encoding of rule category (color vs shape) and rule identity (six possible rules) in all three regions. Rule identity modulated responses to potential choice targets, suggesting that rule information guides behavior by highlighting choice targets. The effects that we observed were not explained by differences in behavioral performance across rules and thus cannot be attributed to reward expectation. Our results suggest that rule maintenance and rule-guided selection of options are distributed processes and provide new insight into orbital and striatal contributions to executive control.
SIGNIFICANCE STATEMENT Rule maintenance, an important executive function, is generally thought to rely on dorsolateral brain regions. In this study, we examined activity of single neurons in orbitofrontal cortex and in ventral and dorsal striatum of macaques in a Wisconsin Card Sorting Task. Neurons in all three areas encoded rules and rule categories robustly. Rule identity also affected neural responses to potential choice options, suggesting that stored information is used to influence decisions. These results endorse the hypothesis that rule maintenance is a broadly distributed mental operation.
Introduction
When we respond to events in our world, our actions depend on the context in which they appear. For example, at a restaurant, it may be appropriate to grab some french fries from one's spouse's plate, but not to grab the same french fries from a stranger's plate. Rule identification and maintenance are essential parts of executive control and are associated with the brain's executive control system (Wallis et al., 2001; Bunge et al., 2003; Wallis and Miller, 2003; Mansouri et al., 2006; Bunge and Wallis, 2007; Buckley et al., 2009; Miller and Wallis, 2009; Stoet and Snyder, 2009; Yamada et al., 2010; Goodwin et al., 2012). Successful rule maintenance and switching are critical for healthy decision making, and are compromised in diseases such as drug addiction (Stalnaker et al., 2009; van der Plas et al., 2009; Woicik et al., 2011).
According to modular views of executive function, rule maintenance ought to be associated with a single brain area that is relatively specialized for that function. A great deal of research indicates that the prefrontal cortex (PFC), especially the dorsolateral PFC (DLPFC), may serve that function (Banich et al., 2000; MacDonald et al., 2000; Wallis et al., 2001; Bunge et al., 2003; Wallis and Miller, 2003; Mansouri et al., 2006; Goodwin et al., 2012; Hussein et al., 2014; Mian et al., 2014). An alternative possibility is that rule maintenance is widely distributed across many brain areas, including regions outside of the PFC. According to this distributed view, rules are maintained in the form of changes to the input–output mappings of neurons throughout the brain, in accordance with their roles in guiding behavior (Wilson et al., 2014).
The first prediction of this distributed view is that neurons outside of dorsal prefrontal regions will show rule sensitivity. The second is that rule maintenance should be associated with systematic changes in neural responses to task stimuli. Some evidence supports the idea that rules are encoded in orbitofrontal cortex (OFC) and dorsal striatum (DS) (Wallis et al., 2001; Wallis and Miller, 2003; Yamada et al., 2010; Tsujimoto et al., 2011; Bissonette and Roesch, 2015). However, the extent and robustness of rule encoding in these regions are not known, nor is the relationship between rule maintenance and responsiveness. In addition, whereas some evidence supports a role for the ventral striatum (VS) in rule switching (Floresco et al., 2006; Sleezer and Hayden, 2016), its role in rule maintenance is not known. Intriguingly, recent discoveries support the idea that OFC and VS, which are often thought to be specialized reward structures, may have broader executive roles (Wilson et al., 2014; Floresco, 2015).
We recorded neuronal activity in OFC, VS, and DS while macaques performed a Wisconsin Card Sorting Task (WCST). This task required monkeys to learn and then follow one of six rules in two possible categories (three color rules and three shape rules). We found that neurons in all three regions encoded individual rules and rule categories throughout the task. Moreover, acquisition of rule-related information altered neural responses to stimuli during the evaluation of choice options, thus implicating rule encoding in these regions in target selection. These findings endorse the idea that rule maintenance is a distributed function and are consistent with the idea that working memory involves, in part, systematic changes in responsiveness of stimulus-sensitive neurons.
Materials and Methods
Surgical procedures.
All animal procedures were approved by the University Committee on Animal Resources at the University of Rochester and were designed and conducted in compliance with the Public Health Service's Guide for the Care and Use of Animals. Two male rhesus macaques (Macaca mulatta) served as subjects. Standard electrophysiological techniques described previously (Strait et al., 2014) were used. A small prosthesis for holding the head was used. Animals were habituated to laboratory conditions and then trained to perform oculomotor tasks for liquid reward. A Cilux recording chamber (Crist Instruments) was placed over the striatum. Position was verified by magnetic resonance imaging with the aid of a Brainsight system (Rogue Research). Animals received appropriate analgesics and antibiotics after all procedures. Throughout both behavioral and physiological recording sessions, the chamber was kept sterile with regular antibiotic washes and sealed with sterile caps.
Recording sites.
OFC, VS, and DS were approached through a standard recording grid (Crist Instruments) using the standard atlas for all area definitions (Paxinos et al., 2000). OFC was defined as the coronal planes situated between 29 and 36 mm rostral to the interaural plane, the horizontal planes situated between 0 and 9 mm from the ventral surface, and lateral to the medial orbital sulcus. Recordings were made from Area 13m (Ongür and Price, 2000) and from VS and DS according to that atlas. VS was defined as lying within the coronal planes situated between 28.02 and 20.66 mm rostral to the interaural plane, the horizontal planes situated between 0 and 8.01 mm from the ventral surface of striatum, and the sagittal planes between 0 and 8.69 mm from the medial wall. DS was defined as the regions of striatum dorsal to the VS within the same coronal planes (Fig. 1C). Recordings were made from a central region within these zones. The majority of our VS recording sites were located in a region corresponding to the core of the nucleus accumbens. Recording locations were confirmed before each recording session using our Brainsight system with structural magnetic resonance images taken before the experiment. Neuroimaging was performed at the Rochester Center for Brain Imaging on a Siemens 3T MAGNETOM Trio Tim using 0.5 mm voxels. Recording locations were further confirmed by listening for characteristic sounds of white and gray matter during recording, which in all cases matched the loci indicated by the Brainsight system. The Brainsight system typically offers an error of <1 mm in the horizontal plane and <2 mm in the z-direction.
Electrophysiological techniques.
Single electrodes (Frederick Haer; impedance range 0.8–4 MΩ) were lowered using a microdrive (NAN Instruments) until the waveforms of one to three neurons were isolated. Individual action potentials were isolated on a Plexon system. Neurons were selected for study solely on the basis of the quality of isolation; they were never preselected based on task-related response properties.
Eye-tracking and reward delivery.
Eye position was sampled at 1000 Hz by an infrared eye-monitoring camera system (SR Research). Stimuli were controlled by a computer running MATLAB (The MathWorks) with Psychtoolbox and Eyelink Toolbox. Visual stimuli were presented on a computer monitor placed 57 cm from the animal and centered on its eyes. A standard solenoid valve controlled the duration of juice delivery. The relationship between solenoid open time and juice volume was established and confirmed before, during, and after recording.
Behavioral task.
The task described here has been described previously (Sleezer and Hayden, 2016). Monkeys performed an analog of the WCST based on that developed by Moore et al. (2005). This task uses stimuli that are nearly identical to those commonly used in human versions of the WCST, with two dimensions (color and shape) and six specific rules (three shapes: circle, star, and triangle, and three colors: cyan, magenta, and yellow; Fig. 1A). On each trial, three stimuli were presented asynchronously, with each stimulus presented at the top, bottom left, or bottom right of the screen. The color, shape, position, and order of stimuli were fully randomized. Each stimulus was presented for 400 ms and was followed by a 600 ms blank period. Monkeys were free to fixate upon the stimuli when they appeared.
Monkeys made at least one saccade to presented stimuli the majority of the time (64.40% for the presentation of stimulus one, 63.18% for the presentation of stimulus two, and 55.99% for the presentation of stimulus three). However, because monkeys did not always look at presented stimuli, all analysis of neural activity during the presentation period was restricted to trials in which monkeys looked at the stimuli. After the stimuli were presented separately, all three stimuli appeared simultaneously with a central fixation spot in the middle of the stimuli. The monkey was required to fixate on the central dot for 100 ms and then indicate its choice by shifting gaze to its preferred stimulus and maintaining fixation on it for 250 ms. Failure to maintain gaze for 250 ms did not lead to the end of the trial, but instead returned the monkey to a choice state; therefore, monkeys were free to change their mind if they did so within 250 ms (although, in our observations, they seldom did so). After a successful 250 ms fixation, visual feedback was provided. Correct choices were followed by positive visual feedback (a green outline around the chosen stimulus) and incorrect choices were followed by negative feedback (a red outline around the chosen stimulus). After visual feedback, there was a 500 ms delay period in which the screen was blank. After the delay period, correct choices were followed by a liquid (water) reward. Incorrect choices were followed by no reward. All trials were separated by an 800 ms intertrial interval, which is referred to herein as the preparatory period for the next trial. During this time, the screen was blank and monkeys' gaze was unconstrained.
In each block, monkeys were required to learn and respond according to one of six specific rules (cyan, magenta, yellow, circle, star, or triangle). Because there were six rules, monkeys were required to use a trial-and-error learning process to determine the correct rule after a rule change. Rule changes occurred after 10, 15, 20, or 30 consecutive correct trials and were not explicitly cued. Block size was fixed within a session, but varied occasionally across sessions. The majority of sessions were conducted with block size of 15 (91.08%, n = 245/269 sessions) and a minority were conducted with a block size of 10 (0.37%, n = 1/269), 20 (2.23%, n = 6/269), or 30 (6.32%, n = 17/269). Because rule switches were not cued, monkeys typically responded incorrectly on the first trial of each block (the inevitable error trial). After the inevitable error trial, monkeys began a trial-and-error process of discovering the new rule.
To identify the point at which monkeys switched to a new rule, a series of monkeys' correct trials at various points after a rule change were examined. Specifically, their accuracy on the trial immediately after the first instance of completing 1, 2, 3, 4, 5, or 6 consecutive correct trials was determined. These findings are described in Sleezer and Hayden (2016). Monkeys' accuracy plateaued after completing four consecutive correct trials. Based on these findings, we reasoned that monkeys had likely fully switched to the new rule when they completed at least four consecutive correct trials. Therefore, the point of rule acquisition was defined as the first trial in the first series of at least four consecutive correct trials in the block.
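The acquisition criterion above (the first trial in the first series of at least four consecutive correct trials) can be illustrated with a short sketch. The function name, the 0-based indexing, and the example block are hypothetical and are not taken from the original analysis code.

```python
from typing import Optional, Sequence


def rule_acquisition_trial(correct: Sequence[bool], run_length: int = 4) -> Optional[int]:
    """Return the 0-based index of the first trial in the first run of at
    least `run_length` consecutive correct trials, or None if no such run
    occurs in the block."""
    streak = 0
    for i, is_correct in enumerate(correct):
        # Extend the current streak on a correct trial, reset it on an error.
        streak = streak + 1 if is_correct else 0
        if streak == run_length:
            # The run started run_length - 1 trials before the current one.
            return i - run_length + 1
    return None


# Hypothetical block: inevitable error, brief exploration, then a stable run.
block = [False, True, False, True, True, False, True, True, True, True, True]
acquisition_index = rule_acquisition_trial(block)
```

Trials before the returned index would count as preacquisition trials and trials from that index onward as postacquisition trials.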
Analysis of behavioral performance across different types of rule changes.
To determine whether monkeys' performance differed depending on the type of rule change that occurred at the beginning of the block, the average number of trials that monkeys completed before rule acquisition after intradimensional and extradimensional rule changes was calculated. Intradimensional rule changes refer to instances when the rule change occurs within one rule category (i.e., color to color or shape to shape), whereas extradimensional rule changes refer to instances when the rule change occurs across rule categories (i.e., color to shape or shape to color). To compare the number of trials that monkeys completed before rule acquisition across intradimensional and extradimensional rule changes, a two-way repeated-measures ANOVA with the between-subjects factor subject (Monkey A, Monkey B) and the within-subjects factor block type (intradimensional, extradimensional) was used. Post hoc Fisher's least-significant difference (LSD) tests were used to compare specific differences across groups.
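As an illustration of the block classification just described, the following sketch labels a rule change as intradimensional or extradimensional. The function and constant names are hypothetical, and treating a repeated rule as an error is an assumption made here for clarity rather than a documented property of the task.

```python
COLOR_RULES = frozenset({"cyan", "magenta", "yellow"})
SHAPE_RULES = frozenset({"circle", "star", "triangle"})


def rule_category(rule: str) -> str:
    """Map a specific rule to its rule category ('color' or 'shape')."""
    if rule in COLOR_RULES:
        return "color"
    if rule in SHAPE_RULES:
        return "shape"
    raise ValueError(f"unknown rule: {rule!r}")


def rule_change_type(previous_rule: str, current_rule: str) -> str:
    """Classify a rule change as 'intradimensional' (same category)
    or 'extradimensional' (different category)."""
    if previous_rule == current_rule:
        # Assumption: a block with an unchanged rule is not a rule change.
        raise ValueError("previous and current rules are identical")
    same_category = rule_category(previous_rule) == rule_category(current_rule)
    return "intradimensional" if same_category else "extradimensional"
```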
Analysis of behavioral performance across rules and rule categories.
To assess behavioral performance across rules and rule categories, the average accuracy on blocks of each rule type across all sessions was calculated and a nested ANOVA with the factors rule (cyan, magenta, yellow, circle, star, triangle), rule category (color, shape), brain region (OFC, VS, and DS), and session number, with rule nested in rule category was run. This ANOVA was run separately for each monkey. Inevitable error trials (the first trial of each block) were excluded from this analysis. To control for changes in task engagement across the block, trials in which the time to achieve fixation at the start of the trial was >5 times the SEM fixation time in that session were excluded (both behavioral and neural data were excluded on these trials).
Analysis of rule-related and rule-category neural activity.
In our version of the WCST, task rules were selected at random with replacement on each block. Because of this design, occasionally not all six rules were sampled during a session. Neurons recorded during these sessions were excluded from all analyses; no other exclusion of neurons occurred. Of the total set of neurons recorded, four were excluded from OFC, 17 from VS, and 12 from DS.
Because the average neural activity across blocks was examined, it is possible that slow fluctuations in baseline firing rate across sessions (either due to attention, satiation, or neuron isolation, among other factors) could lead to spurious correlations between firing rate and rule information. For example, if baseline firing rates tended to be higher during the first half of the session and magenta rule blocks happened by chance to occur more in the first half of the session, these slow fluctuations could drive firing rates and neurons with this type of baseline fluctuation would appear to be encoding the magenta rule. These slow fluctuation biases are not nearly as likely in studies that have a trial as opposed to block structure (Strait et al., 2014).
Therefore, for all rule-related analyses, a normalization method that subtracts out slow fluctuations in baseline firing rate across sessions was used. This method is based on that used by Mansouri et al. (2006) for the same purpose. Specifically, a “local average” for the current block was first calculated by adding the average firing rates during the previous, current, and next blocks using weights of 0.25, 0.5, and 0.25, respectively. For the first block, the local average was calculated by taking the average across the first two blocks and, for the last block, the local average was calculated by taking the average across the last two blocks. The local average was then subtracted from the firing rate activity on each trial. In this procedure, all calculations were performed separately for each 10 ms bin of spike data in each trial. Specifically, the average firing rates in each bin for the current block, for the preceding block, and for the following block were used. This average was then used to normalize the firing on a trial-by-trial basis for each 10 ms time bin separately. By doing so, we were able to correct for slow fluctuations across blocks while preserving the structure of the data within trials.
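The local average normalization can be sketched as follows for one neuron. This is a minimal illustration, not the original analysis code: the function names and example values are hypothetical, and equal weighting of the two blocks at the session edges is an assumption based on the phrase "taking the average across the first two blocks."

```python
from typing import List


def local_averages(block_means: List[List[float]]) -> List[List[float]]:
    """block_means[b][t] is the mean firing rate in block b, time bin t.
    Returns the weighted local average for every block and time bin."""
    n_blocks = len(block_means)
    if n_blocks < 2:
        raise ValueError("at least two blocks are required")
    result = []
    for b in range(n_blocks):
        if b == 0:
            # First block: plain average of the first two blocks (assumed equal weights).
            neighbors, weights = [0, 1], [0.5, 0.5]
        elif b == n_blocks - 1:
            # Last block: plain average of the last two blocks (assumed equal weights).
            neighbors, weights = [n_blocks - 2, n_blocks - 1], [0.5, 0.5]
        else:
            # Interior block: previous, current, and next blocks.
            neighbors, weights = [b - 1, b, b + 1], [0.25, 0.5, 0.25]
        n_bins = len(block_means[b])
        result.append([
            sum(w * block_means[k][t] for k, w in zip(neighbors, weights))
            for t in range(n_bins)
        ])
    return result


def normalize_trial(trial_rates: List[float], local_average: List[float]) -> List[float]:
    """Subtract the block's local average from one trial, bin by bin."""
    return [rate - avg for rate, avg in zip(trial_rates, local_average)]


# Hypothetical session: three blocks, two 10 ms bins each (rates in spikes/s).
means = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
local = local_averages(means)
normalized = normalize_trial([20.0, 10.0], local[1])
```

Because the subtraction is done per time bin, the within-trial time course is preserved while slow drifts across blocks are removed.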
Rule selectivity across time.
To examine average rule selectivity in OFC, VS, and DS neurons across time within trials, the normalized firing rate activity on each trial was calculated (using the local average procedure) and a sliding ANOVA with the factors rule category (color and shape) and rule (cyan, magenta, yellow, circle, star, and triangle) was used, with rule nested in rule category and a window size of 800 ms, slid in 20 ms steps, across a total of 7340 ms. Because monkeys were required to fixate on a central point of the screen before the presentation of choice options and also before choice, firing rate data were aligned in three ways across the trial to eliminate variations in the length of time required for monkeys to fixate at each point. Specifically, data were aligned to the previous feedback period, the presentation of the first choice option, and to the feedback period on the current trial, which allowed us to look at the preparatory period, the presentation period, and the choice period, respectively.
Rule selectivity during choice and preparatory periods.
To examine average rule selectivity in OFC, VS, and DS neurons across the choice and preparatory period epochs, the normalized firing rate activity during these epochs was calculated first. The preparatory period consisted of the 800 ms between trials in which the screen was blank. Because choice-related neural activity typically lasted longer than the 250 ms that monkeys were required to fixate on selected options, both the 250 ms fixation period and the following 400 ms feedback period were included in analyses of the choice period and this is referred to simply as the “choice period.” A nested ANOVA with the factors rule category (color and shape) and rule (cyan, magenta, yellow, circle, star, and triangle), with rule nested in rule category, was then used. All correct trials were included in this analysis because we were first interested in determining, generally, whether neurons demonstrated rule encoding across the block. Our later analyses examine more specifically whether rule encoding differed before and after rule acquisition.
Rule selectivity during presentation epochs.
To examine rule selectivity during the three presentation epochs, the normalized firing rate during each epoch was calculated across all trials and a nested ANOVA with the following factors was used: rule-category-relevant stimulus attributes, rule-category-irrelevant stimulus attributes, rule category, and order of presentation. We were particularly interested in examining the main effects of rule category and of category-relevant and irrelevant attributes. In addition, we were interested in examining the interaction between category-relevant and category-irrelevant attributes because this measure would provide a means of looking at encoding of both stimulus dimensions. Monkeys made at least one saccade to presented stimuli the majority of the time. However, because monkeys did not always look at presented stimuli, all analysis of neural activity during the presentation period was restricted to instances in which monkeys looked at each of the three stimuli.
The order of presentation was included as a factor in the ANOVA to control for differences that might exist depending on whether the stimulus was presented first, second, or third. Category-relevant and category-irrelevant stimulus attributes were nested in rule category because relevancy was determined by rule category. For example, if the rule category is color, then the category-relevant stimulus attributes are cyan, magenta, and yellow and the category-irrelevant stimulus attributes are circle, star, and triangle. Alternatively, if the rule category is shape, then the category-relevant stimulus attributes are circle, star, and triangle and the category-irrelevant stimulus attributes are cyan, magenta, and yellow.
This ANOVA was run using the normalized firing rate for each presentation epoch for trials before and after rule acquisition (defined as the first correct trial in a series of at least four consecutive correct trials; see Materials and Methods). Because we were particularly interested in examining neural activity related to rule category and stimulus attributes within rule categories before and after learning, these analyses were limited to instances when monkeys switched rules across rule categories (i.e., an extradimensional switch including instances when monkeys switched from color rules to shape rules or from shape rules to color rules).
To determine whether the proportions of cells demonstrating an effect were significantly above chance, binomial tests were used. To determine whether the proportions of cells demonstrating an effect were significantly different across brain regions or trial periods, binomial logistic regression was used. For all cases, the model was fit using a generalized estimating equation procedure implemented in SPSS. In this procedure, an omnibus Wald χ2 test was applied to determine the significance of group effects, followed by pairwise comparisons using Fisher's LSD tests to examine specific group effects.
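A binomial test of whether the proportion of significant cells exceeds chance can be sketched as below. This is an illustration only: the chance level of 0.05 (the false-positive rate expected from a per-neuron test at α = 0.05), the one-sided form of the test, and the example counts are assumptions and are not stated in the text.

```python
from math import comb


def binomial_p_upper(n_significant: int, n_cells: int, p_chance: float = 0.05) -> float:
    """One-sided binomial p value: the probability of observing at least
    `n_significant` significant cells out of `n_cells` if each cell is
    significant independently with probability `p_chance`."""
    return sum(
        comb(n_cells, k) * p_chance**k * (1.0 - p_chance) ** (n_cells - k)
        for k in range(n_significant, n_cells + 1)
    )


# Hypothetical example: 20 of 100 cells show a significant effect.
p_value = binomial_p_upper(20, 100)
```

The logistic regression and generalized estimating equation steps are not sketched here because they depend on the SPSS implementation described above.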
Computation of selectivity indices.
To quantify the strength of rule and rule category selectivity, a rule selectivity index (RSI) and a rule category selectivity index (CSI) were calculated. For both RSI and CSI, all neurons that had enough trials were analyzed. To calculate selectivity indices, the average normalized firing rate was first computed for each of the six rules. Because our normalization procedure yielded negative values in some cases, average normalized firing rates were rectified across rules. To do this, the most negative average normalized response across the six rules was identified and its absolute value was added to the average normalized response for all six rules, thus setting the most negative value to zero and shifting the remaining values accordingly. The RSI was then calculated using the following formula (Moody et al., 1998):
$$\mathrm{RSI} = \frac{n - \sum_{i=1}^{n} \left( \lambda_i / \lambda_{\max} \right)}{n - 1}$$
where n is the total number of rules, λi is the firing rate of the neuron for the ith rule, and λmax is the neuron's firing rate for the rule that elicited the maximum firing rate. RSI values, which range from 0 to 1, indicate how well a neuron differentiates between individual rules; values closer to 0 indicate low selectivity and values closer to 1 indicate high selectivity. For example, a cell that responds strongly to cyan but weakly to the remaining five rules would have a high RSI value, whereas a cell responding equally to all six rules would have an RSI value close to 0.
CSI values, which range from −1 to 1, indicate how well neurons differentiate between rule categories while also taking into account how well neurons differentiate between rules within rule categories. To calculate the CSI, the between-category difference was first determined by calculating the average absolute difference in firing rates for each pair of rules between rule categories. The within-category difference was then determined by calculating the average absolute difference in firing rates for each pair of rules within rule categories. The within-category difference was subtracted from the between-category difference and this value was divided by the sum of the within-category difference and the between-category difference. CSI values closer to 1 indicate higher rule category selectivity.
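The two selectivity indices can be illustrated with the sketch below. It assumes that the RSI takes the form (n − Σ λi/λmax)/(n − 1), which matches the stated symbols and the stated 0 to 1 range, and that rectification is applied only when at least one rule average is negative. The function names, the handling of degenerate cases (returning 0 when all rates are zero), and the example values are hypothetical.

```python
from itertools import combinations, product
from typing import List, Sequence


def rectify(rates: Sequence[float]) -> List[float]:
    """Shift rates so the most negative value becomes zero (no-op if none is negative)."""
    lowest = min(rates)
    return [r - lowest for r in rates] if lowest < 0 else list(rates)


def rsi(rates: Sequence[float]) -> float:
    """Rule selectivity index over per-rule average normalized firing rates."""
    rectified = rectify(rates)
    n = len(rectified)
    peak = max(rectified)
    if peak == 0:
        return 0.0  # Assumption: a flat, all-zero profile counts as unselective.
    return (n - sum(r / peak for r in rectified)) / (n - 1)


def csi(color_rates: Sequence[float], shape_rates: Sequence[float]) -> float:
    """Rule category selectivity index from per-rule rates in each category."""
    # Between-category: every color rule paired with every shape rule.
    between_pairs = [abs(a - b) for a, b in product(color_rates, shape_rates)]
    # Within-category: every pair of rules inside the same category.
    within_pairs = [abs(a - b) for a, b in combinations(color_rates, 2)]
    within_pairs += [abs(a - b) for a, b in combinations(shape_rates, 2)]
    between = sum(between_pairs) / len(between_pairs)
    within = sum(within_pairs) / len(within_pairs)
    if between + within == 0:
        return 0.0  # Assumption: identical rates everywhere give no selectivity.
    return (between - within) / (between + within)
```

Because the CSI uses only absolute differences between rates, it is unaffected by the constant shift applied during rectification.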
Significance testing for selectivity indices.
To determine whether population selectivity (for both RSI and CSI) was significantly different from chance, the trial labels were shuffled and the selectivity index calculated 100 times. For each neuron and on each iteration, one of the shuffled values was selected randomly and subtracted from the unshuffled selectivity index and then the mean of this difference was calculated across neurons; this procedure was repeated 100 times. This procedure thus resulted in 100 × 100 = 10,000 resampling iterations. Next, it was determined whether the resulting distributions of difference values (i.e., RSI − shuffled RSI and CSI − shuffled CSI) were different from zero using one-sample Wilcoxon signed-rank tests. To compare selectivity across populations of VS, DS, and OFC neurons, the same difference measures were calculated and Wilcoxon matched-pairs tests were applied.
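One reading of the resampling step is sketched below: given each neuron's observed index and its set of shuffled indices, a randomly chosen shuffled value is subtracted from the observed value and the differences are averaged across neurons, once per iteration. The function name, the fixed random seed, and the example values are hypothetical, and the subsequent Wilcoxon tests are not shown.

```python
import random
from statistics import mean
from typing import List, Sequence


def mean_difference_distribution(
    observed: Sequence[float],
    shuffled: Sequence[Sequence[float]],
    n_iterations: int = 100,
    seed: int = 0,
) -> List[float]:
    """observed[i] is neuron i's unshuffled index; shuffled[i] holds that
    neuron's indices computed from label-shuffled data. Returns one
    population-mean difference (observed minus shuffled) per iteration."""
    rng = random.Random(seed)  # Seeded here only to make the sketch reproducible.
    distribution = []
    for _ in range(n_iterations):
        differences = [
            obs - rng.choice(null_values)
            for obs, null_values in zip(observed, shuffled)
        ]
        distribution.append(mean(differences))
    return distribution


# Hypothetical two-neuron population with constant shuffled values.
dist = mean_difference_distribution([0.5, 0.3], [[0.1] * 100, [0.1] * 100])
```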
Because Subject B's accuracy differed across rules and rule categories (see Results), several steps were taken to determine whether differences in behavioral performance influenced firing rates. First, a correlation between normalized firing rate and accuracy on each rule type (percentage correct performance) was run for each subject separately. Then, because Subject C performed equally well across rules and rule categories, it was determined whether Subject B and Subject C differed in regard to the number of cells demonstrating an effect of rule or rule category during the preparatory period, the choice period, or the presentation period. To determine whether the proportions of cells demonstrating an effect were significantly different across Subject B and Subject C, χ2 tests for independent samples were performed.
Rule modulation across correct and error trials and after intradimensional and extradimensional rule changes.
To determine whether rule encoding in OFC or striatum was related to monkeys' behavior and if it differed depending on the type of rule change that occurred at the beginning of the block (intradimensional or extradimensional), rule-related modulation was examined during three different trial types (perseverative errors, correct responses before rule acquisition, and correct responses after rule acquisition) and across intradimensional or extradimensional blocks. Perseverative errors were defined as instances in which monkeys chose according to the previous rule before rule acquisition (the first trial in a series of four consecutive correct trials; Sleezer and Hayden, 2016). Because this analysis required comparison of three different trial types across two different block types, we did not have enough power to examine rule modulation across each of the six rules individually as we did in our previous analyses. Therefore, rule modulation was examined by determining, for each neuron and on each block, whether the current rule was more preferred or less preferred than the previous rule. More specifically, for each neuron, the average normalized firing rate for each rule during each epoch of interest (the choice period and the preparatory period, separately) was calculated across all trials. Then it was determined, for each block, whether the current rule was less preferred or more preferred than the previous rule. This allowed us to compare firing rate activity across two different conditions: (1) blocks in which the rule changed from a less preferred rule to a more preferred rule and (2) blocks in which the rule changed from a more preferred rule to a less preferred rule. The difference in normalized firing rate activity in each condition was taken as our measure of rule modulation and the population averages for OFC, VS, and DS were calculated by averaging across neurons. 
Finally, a four-way mixed-model ANOVA using the within-subjects factors epoch (preparatory period or choice period), block type (intradimensional or extradimensional), and trial type (perseverative errors, preacquisition correct trials, and postacquisition correct trials), and the between-subjects factor brain region (OFC, VS, DS) was performed. In this analysis, “within-subjects” and “between-subjects” refer to neurons. Post hoc Fisher's LSD tests were conducted to examine individual differences across conditions.
Generation of peristimulus time histograms (PSTHs).
PSTHs were constructed by aligning spike rasters to the onset of visual feedback and averaging firing rates across multiple trials. Firing rates were calculated in 10 ms bins. For display, example cell PSTHs were smoothed using a Gaussian kernel (σ = 500 ms). Figures depicting the average proportion of cells demonstrating a significant effect across time within the trial were constructed by calculating each measure for each cell, using a 500 ms sliding window, slid in 20 ms steps. Neurons were then averaged to obtain an average measure for the populations of OFC, VS, and DS cells.
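The binning and trial-averaging steps of PSTH construction can be sketched as follows. The function name, the millisecond time units, the analysis window, and the example spike times are hypothetical. Gaussian smoothing and the sliding-window population measures are omitted.

```python
from typing import List, Sequence, Tuple


def psth(
    spike_times_ms: Sequence[Sequence[float]],
    event_times_ms: Sequence[float],
    window_ms: Tuple[int, int] = (0, 200),
    bin_ms: int = 10,
) -> List[float]:
    """Align each trial's spikes to its event time, count spikes in
    `bin_ms` bins across `window_ms`, and return the trial-averaged
    firing rate (spikes/s) in each bin."""
    start, end = window_ms
    n_bins = (end - start) // bin_ms
    counts = [0] * n_bins
    for spikes, event in zip(spike_times_ms, event_times_ms):
        for t in spikes:
            relative = t - event - start  # Time from the start of the window.
            if 0 <= relative < n_bins * bin_ms:
                counts[int(relative // bin_ms)] += 1
    n_trials = len(event_times_ms)
    # Convert summed counts to a mean rate in spikes per second.
    return [c * 1000 / (n_trials * bin_ms) for c in counts]


# Hypothetical example: two trials aligned to feedback onset.
rates = psth([[105.0, 112.0], [1003.0]], [100.0, 1000.0])
```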
Statistical analyses were performed using MATLAB release 2012b (The MathWorks), SPSS Statistics version 24 (IBM Analytics), and GraphPad Prism version 6.
Results
These results include new analyses of previously published data (VS and DS data; Sleezer and Hayden, 2016) and analyses of unpublished data from the same two subjects (OFC data). In our previous study, we examined neural activity related to rule switching in VS and DS. In the current study, we were interested in examining neural activity related to rule encoding and rule category encoding in OFC, VS, and DS. All analyses appearing below are new.
Two trained macaques performed 269 sessions of the WCST. Overall, monkeys performed quite well. In each block, monkeys engaged in a brief exploration phase before settling into a stable rule maintenance phase (Fig. 2A). To determine whether monkeys' performance differed depending on the type of rule change that occurred at the beginning of the block, we examined the number of trials monkeys completed before rule acquisition after intradimensional and extradimensional rule changes using a two-way repeated-measures ANOVA with the between-subjects factor subject (Monkey A, Monkey B) and the within-subjects factor block type (intradimensional, extradimensional). This analysis revealed a significant main effect of subject (F(1,267) = 75.70; p < 0.0001) and of block type (F(1,267) = 136.2; p < 0.0001), but no interaction between the two (F(1,267) = 0.7783; p = 0.3785). The results of our post hoc comparisons are shown in Figure 2B. We found that both monkeys completed more trials before rule acquisition after extradimensional rule changes compared with intradimensional rule changes (p < 0.0001 for both monkeys). This finding is consistent with previous studies (for review, see Brown and Tait, 2016) and likely arises due to the cost of shifting attention from one rule category to another. Therefore, our results suggest that monkeys develop and maintain a mental representation of the appropriate rule category when making choices and subsequently readjust their attention toward the opposite rule category when switching across categories.
For the present study, we were particularly interested in knowing whether performance (i.e., accuracy), and thus reward expectation, was correlated with rule or rule category because this factor could lead to spurious correlations between firing rate and rule information. Figure 2C shows monkeys' average percentage performance for each rule. To compare performance across rules and rule categories, we used a nested ANOVA with the factors rule (cyan, magenta, yellow, circle, star, triangle), rule category (color, shape), brain region (OFC, VS, and DS), and session number, with rule nested in rule category. In Subject C, we found no main effect of rule (F(4,925) = 2.0779; p = 0.0817), rule category (F(1,925) = 0.3376; p = 0.5614), or brain region (F(2,925) = 1.0481; p = 0.3510) and no interaction between rule and brain region (F(8,925) = 1.6046; p = 0.1193) or between rule category and brain region (F(2,925) = 1.2668; p = 0.2822). In Subject B, we found a significant main effect of rule (F(4,925) = 7.4251; p < 0.0001) and a significant main effect of rule category (F(1,925) = 55.5679; p < 0.0001), but no main effect of brain region (F(2,925) = 1.9647; p = 0.1415) and no interaction between rule and brain region (F(8,925) = 1.8022; p = 0.0730) or between rule category and brain region (F(2,925) = 2.2193; p = 0.1100). These results suggest that Subject C performed equally well across rules, rule categories, and OFC, VS, and DS recordings, whereas Subject B demonstrated differential performance across rules and rule categories, but not across OFC, VS, and DS recordings. We performed several subsequent analyses to verify that our effects were not artifactual consequences of these biases.
OFC, VS, and DS neurons demonstrate rule and rule category modulation during choice and preparatory periods
We recorded activity during this task from a total of 422 neurons, including 115 in OFC (49 from Subject B and 66 from Subject C), 103 in VS (47 from Subject B and 56 from Subject C), and 204 in DS (77 from Subject B and 127 from Subject C). We excluded four neurons from OFC, 17 neurons from VS, and 12 neurons from DS because not all of the six rules were sampled throughout the duration of a session while these neurons were recorded. No other exclusion of neurons occurred.
We first quantified the number of cells demonstrating rule modulation using a multifactorial nested ANOVA procedure (see Materials and Methods). Figure 3, A–C, shows the average response of example neurons from OFC, VS, and DS, respectively, that each have a significant main effect of rule during the choice period (OFC, F(4,442) = 2.9121; p = 0.0213 and VS, F(4,417) = 2.7027; p = 0.0302) or the preparatory period (DS, F(4,484) = 2.5886; p = 0.0362).
Figure 3, D–F, shows the proportion of all cells demonstrating a significant effect of rule in the OFC, VS, and DS across time. Rule-related modulation was more common than would be expected by chance in all three regions during both the preparatory period and the choice period. These results are summarized in Table 1.
We also examined the proportion of cells demonstrating rule category modulation. Figure 4, A–C, shows the average response of example neurons from OFC, VS, and DS, respectively, demonstrating a significant main effect of rule category during the choice period (OFC, F(1,471) = 26.3087; p < 0.0001 and DS, F(1,488) = 8.2836; p = 0.0042) or the preparatory period (VS, F(1,1317) = 4.6042; p = 0.0321). Figure 4, D–F, shows the proportion of cells modulated by rule category in the OFC, VS, and DS across time within trials. We found that rule category modulation was more common than would be expected by chance in all three brain regions during the preparatory period and more common than would be expected by chance in OFC and DS, but not VS, during the choice period. These results are summarized in Table 1.
To compare directly the proportions of cells demonstrating rule and rule category modulation across groups, we implemented a mixed-model binary logistic regression procedure using the between-subjects factor brain region (OFC, VS, DS) and the within-subjects factors epoch (preparatory, choice) and task variable (rule, rule category). In this analysis, “within-subjects” and “between-subjects” refer to neurons. In this procedure, an omnibus Wald χ2 test was applied to determine the significance of group effects. Although we found a significant main effect of task variable (χ2 = 6.8033, p = 0.0091), we did not find a significant main effect of brain region (χ2 = 3.1989, p = 0.2020) or epoch (χ2 = 0.0188, p = 0.8911), nor did we find any other significant interactions with brain region (p > 0.05 for all comparisons). Because we did not find a significant main effect of brain region or any interactions with brain region, we concluded that the proportion of cells demonstrating rule-related modulation did not differ across OFC, VS, and DS. Therefore, we did not conduct further pairwise comparisons to examine differences across regions.
We next examined how rule encoding changed as monkeys moved from learning the rule (early in blocks) to maintaining it (late in blocks). To do this, we repeated our previous analyses separately for correct trials before and after the point of rule acquisition (i.e., the first correct trial in a series of at least four consecutive correct trials; Sleezer and Hayden, 2016). We again performed a mixed-model binary logistic regression procedure, this time using the between-subjects factor brain region (OFC, VS, DS) and the within-subjects factors trial period (preacquisition, postacquisition), epoch (preparatory, choice), and task variable (rule, rule category). In this analysis, “within-subjects” and “between-subjects” refer to neurons. We found a significant main effect of trial period (χ2 = 30.9182, p < 0.0001) and of task variable (χ2 = 6.1995, p = 0.0128), but no main effect of brain region (χ2 = 2.8560, p = 0.2398) or epoch (χ2 = 0.0513, p = 0.8209). We also did not find any significant interactions between any of the four variables (p > 0.05 for all comparisons). Because we did not find a significant main effect of brain region or any interactions with brain region, we concluded that the proportion of cells demonstrating rule-related modulation before and after rule acquisition did not differ across OFC, VS, and DS. Therefore, we did not conduct subsequent pairwise comparisons to examine differences in the proportion of cells demonstrating rule-related modulation across regions.
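The acquisition criterion used above (the first correct trial in a series of at least four consecutive correct trials) can be illustrated with a minimal sketch. The function name, the boolean trial encoding, and the example block are hypothetical and are not taken from the original analysis code.

```python
from typing import Optional, Sequence


def rule_acquisition_index(correct: Sequence[bool], run_length: int = 4) -> Optional[int]:
    """Index of the first correct trial that starts a run of at least
    `run_length` consecutive correct trials, or None if no such run exists."""
    streak = 0
    for i, ok in enumerate(correct):
        streak = streak + 1 if ok else 0
        if streak == run_length:
            # The run started run_length - 1 trials before the current one.
            return i - run_length + 1
    return None


# Hypothetical block: True = correct trial, False = error trial.
block = [False, True, False, True, True, False, True, True, True, True, True]
idx = rule_acquisition_index(block)

# Split correct trials into preacquisition and postacquisition sets.
pre_correct = [i for i, ok in enumerate(block) if ok and i < idx]
post_correct = [i for i, ok in enumerate(block) if ok and i >= idx]
```

In this sketch, correct trials that precede the acquisition point are treated as preacquisition and all later correct trials as postacquisition, which is one reading of the split described in the text.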
To further examine the main effect of trial period, we conducted pairwise comparisons using Fisher's LSD tests to compare proportions of significant cells across the preacquisition and postacquisition trial periods. The results of these analyses are shown in Figure 5 and Table 2. Overall, we found that the proportion of neurons demonstrating a significant effect of rule and rule category tended to be above chance only after rule acquisition in most, but not all, cases.
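This section does not state which test was used to compare the proportion of significant cells against chance. One common choice, assumed here purely for illustration, is a one-sided binomial test against the 5% false-positive rate implied by a p < 0.05 criterion. The counts below are the postacquisition OFC preparatory-period rule counts reported later in the Results (8/46 and 4/65 cells, pooled across subjects); the function name is hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float = 0.05) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the probability of observing at
    least k significant cells out of n if each cell has a false-positive
    probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_significant = 8 + 4   # cells with a significant rule effect, both subjects
n_cells = 46 + 65       # OFC cells retained after exclusions
p_value = binomial_tail(n_significant, n_cells)
```

A small `p_value` would indicate more significant cells than expected from false positives alone. The test treats cells as independent, which is an additional assumption.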
Strength of rule and rule category modulation in OFC, VS, and DS
We next examined the strength of rule and rule category selectivity. To quantify selectivity, we calculated an RSI and a CSI (see Materials and Methods). RSI values, which range from 0 to 1, indicate how well a neuron differentiates between individual rules, with values closer to 0 indicating low selectivity and values closer to 1 indicating high selectivity. CSI values, which range from −1 to 1, indicate how well neurons differentiate between rule categories (shape or color). CSI values closer to 1 indicate higher rule category selectivity (see Materials and Methods).
Figure 6, A–F, shows the distributions of RSI and CSI differences (i.e., RSI − shuffled RSI and CSI − shuffled CSI) before and after rule acquisition in OFC, VS, and DS during the preparatory period. In OFC, during the preparatory period, both indices were greater than chance before and after rule acquisition (Fig. 6A,D, light blue and dark blue bars, p < 0.0001 for all comparisons, Wilcoxon signed-rank tests). In VS, RSI and CSI values were significantly lower than chance before rule acquisition and significantly greater than chance after rule acquisition (Fig. 6B,E, light orange and dark orange bars, p < 0.0001 for all four comparisons). In DS, we found that RSI values were significantly lower than chance before rule acquisition (p < 0.0001) and after rule acquisition (p = 0.0004), whereas CSI values were significantly lower than chance before rule acquisition (p < 0.0001) and significantly greater than chance after rule acquisition (p < 0.0001; Fig. 6C,F, light green and dark green bars). For all three regions, we found that both RSI and CSI values were greater after rule acquisition compared with before rule acquisition (p < 0.0001 for all six comparisons, Wilcoxon matched-pairs tests). Overall, these results suggest that, after rule acquisition, OFC, VS, and DS neurons are selective for rule and rule categories and this selectivity is greater than selectivity before rule acquisition. These data thus show that rule encoding in all three areas tracks rule learning directly.
Figure 6, G–L, shows the distributions of RSI and CSI differences before and after rule acquisition in OFC, VS, and DS during the choice period. In OFC, we found that RSI values were significantly lower than chance before rule acquisition and after rule acquisition, whereas CSI values were significantly greater than chance before and after rule acquisition (Fig. 6G,J, light blue and dark blue bars, p < 0.0001 for all four comparisons, Wilcoxon signed-rank tests). Note that, for this analysis, lower than chance indicates a coding scheme that is more dense than a Gaussian/normal distribution. In VS, we found that RSI and CSI values were not different from chance before rule acquisition (p = 0.4174 and p = 0.7678), but were significantly greater than chance after rule acquisition (p = 0.0201 and p < 0.0001; Fig. 6H,K, light orange and dark orange bars). In DS, we found that RSI values were significantly lower than chance before rule acquisition and after rule acquisition (p < 0.0001 and p < 0.0001), whereas CSI values were not different from chance before rule acquisition (p = 0.9829) and were significantly greater than chance after rule acquisition (p < 0.0001; Fig. 6I,L, light green and dark green bars).
RSI values did not differ before and after acquisition in OFC and DS (p = 0.2790 and p = 0.6298, Wilcoxon matched-pairs tests), but increased significantly in VS after rule acquisition (p = 0.0294). For all three regions, CSI values increased with learning (p < 0.0001 for all three comparisons). Together, these results demonstrate that, in contrast to the preparatory period, only VS, not OFC or DS, demonstrates greater rule selectivity after rule acquisition compared with before rule acquisition, whereas all three regions show greater rule category selectivity after rule acquisition compared with before rule acquisition.
Rule-related modulation on correct and error trials and across intradimensional and extradimensional blocks
We next wanted to know whether rule encoding in OFC or striatum was related to the monkeys' behavior, particularly on perseverative error trials, and also whether rule encoding differed depending on the type of rule change that occurred at the beginning of the block (intradimensional or extradimensional). To do this, we examined rule-related modulation during the preparatory and choice periods on perseverative error trials, correct responses before rule acquisition, and correct responses after rule acquisition and across blocks after intradimensional or extradimensional rule changes.
Because this analysis required us to compare three different trial types across two different block types, we did not have enough power to examine rule modulation across each of the six rules individually, as we did in our previous analyses. Therefore, we examined rule modulation by calculating firing rate activity on blocks in which the rule switched from a less preferred rule to a more preferred rule and on blocks in which the rule switched from a more preferred rule to a less preferred rule (see Materials and Methods). We used the difference between these two conditions as our measure of rule modulation. We then performed a four-way mixed-model ANOVA on these difference measures using the within-subjects factors epoch (preparatory period, choice period), block type (intradimensional, extradimensional), and trial type (perseverative errors, preacquisition correct trials, postacquisition correct trials) and the between-subjects factor brain region (OFC, VS, DS). In this analysis, “within-subjects” and “between-subjects” refer to neurons.
The results of the ANOVA are shown in Table 3. To examine these effects in more detail, we performed several subsequent post hoc analyses. First, we were interested in looking at the effect of block type on rule modulation across trial types (i.e., the significant interaction between block type and trial type). Because we did not find an interaction between block type and brain region, nor did we find any three- or four-way interactions involving block type, we conducted this set of post hoc analyses by comparing rule modulation across intradimensional and extradimensional blocks during each of the three trial types after collapsing across epochs and brain regions (Fig. 7A). This analysis revealed a significantly greater magnitude of rule modulation on intradimensional blocks compared with extradimensional blocks during perseverative error trials (p = 0.0214, Fisher's LSD test), but not during preacquisition correct trials (p = 0.7142) or during postacquisition correct trials (p = 0.1786).
Our finding that neurons demonstrated greater rule modulation during perseverative error trials on intradimensional blocks compared with perseverative error trials on extradimensional blocks is not necessarily surprising given our previous finding that neurons in all three regions carry information about rule categories (see above). More specifically, given that the rule category does not change during intradimensional rule changes, stronger rule encoding on perseverative error trials during intradimensional switches compared with extradimensional switches may reflect maintenance of rule category information after intradimensional rule changes.
In addition to examining rule modulation across block types, we also examined the main effect of trial type by comparing group means across trial types (Fig. 7A). This analysis revealed a greater magnitude of rule modulation on preacquisition correct trials compared with perseverative error trials (p < 0.0001, Fisher's LSD test) and on postacquisition correct trials compared with preacquisition correct trials (p < 0.0001). These results provide further evidence that the strength of rule modulation increases as monkeys acquire rules behaviorally.
Next, we were interested in examining differences in the strength of rule modulation across brain regions. To do this, we compared rule modulation across brain regions on each of the three trial types and during the preparatory and choice epochs (Fig. 7B–E). Because we did not find a significant interaction among brain region, trial type, epoch, and block type, we conducted this analysis by collapsing across block types. We found no difference in rule modulation across brain regions during the preparatory period on any of the three trial types (p > 0.05 for all comparisons, Fisher's LSD tests). In contrast, during the choice period, rule modulation was significantly greater in VS compared with OFC on perseverative error trials (p = 0.0145) and significantly greater in VS compared with both OFC and DS on preacquisition correct trials (p = 0.0173 and p = 0.0232). These results suggest that information about correct rules may begin to arise earlier in VS compared with DS and OFC. These results are consistent with our previous finding that VS signals rule switches early in the trial-and-error learning period, whereas DS signals rule switches later, once rules are fully acquired (Sleezer and Hayden, 2016).
We also investigated whether the timing of rule modulation within trials differed across OFC, VS, and DS. Specifically, we examined the time to maximum rule modulation (i.e., the time to the maximum difference in normalized firing rate between blocks in which the rule switched from a less preferred rule to a more preferred rule and blocks in which the rule switched from a more preferred rule to a less preferred rule) across the populations of OFC, VS, and DS neurons during both the preparatory and choice periods. To compare the distribution of cell latencies across regions, we used Kruskal–Wallis tests. The average latencies for OFC, VS, and DS during the preparatory period were as follows: 410.60, 349.55, and 387.50 ms, respectively, and the average latencies during the choice epoch were as follows: 144.23, 129.30, and 131.62 ms, respectively. We did not find a significant difference among OFC, VS, and DS for either epoch (preparatory epoch: p = 0.1905; choice epoch: p = 0.3340).
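The latency measure described above (time to the maximum difference in normalized firing rate between the two block types) can be sketched as follows. The bin width, the use of the signed difference, the function name, and the example traces are all assumptions made for illustration, because the binning details are given in Materials and Methods rather than here.

```python
from typing import Sequence


def time_to_max_modulation(
    to_preferred: Sequence[float],
    to_nonpreferred: Sequence[float],
    bin_ms: float = 10.0,
) -> float:
    """Time (ms from epoch onset) of the bin in which the normalized firing
    rate on switch-to-preferred blocks most exceeds the rate on
    switch-to-nonpreferred blocks. Ties resolve to the earliest bin."""
    if len(to_preferred) != len(to_nonpreferred) or not to_preferred:
        raise ValueError("traces must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(to_preferred, to_nonpreferred)]
    peak_bin = max(range(len(diffs)), key=diffs.__getitem__)
    return peak_bin * bin_ms


# Hypothetical normalized firing-rate traces for one neuron (10 ms bins).
pref = [0.1, 0.3, 0.9, 0.6]
nonpref = [0.1, 0.2, 0.3, 0.4]
latency_ms = time_to_max_modulation(pref, nonpref)
```

Applying such a function to each neuron would yield one latency per cell, and the resulting per-region distributions are what a Kruskal–Wallis test would then compare.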
Rule-related modulation during the presentation of choice options
We hypothesized that OFC and striatum do not serve solely as a site of storage, but instead that their rule representations serve the animals' goals by guiding decisions. This proposal predicts that neural responses to the offers will depend on the rule and the offer identity (Miller and Desimone, 1994; Chelazzi et al., 1998; Romo and Salinas, 2003; Mirabella et al., 2007; Hernández et al., 2010; Lui and Pasternak, 2011; Hayden and Gallant, 2013).
To test this idea, we used a nested ANOVA with the following factors: category-relevant stimulus attributes, category-irrelevant stimulus attributes, rule category, and order of presentation (see Materials and Methods). We ran this ANOVA using the average firing rate for each presentation epoch for trials before and after rule acquisition (Sleezer and Hayden, 2016). We included only instances in which monkeys made at least one saccade to each choice option in this analysis (64.40% for stimulus one, 63.18% for stimulus two, and 55.99% for stimulus three).
The results of this analysis are shown in Figure 8 and Table 4. We found that activity in a significant proportion of cells was modulated by rule category during the presentation period after rule acquisition in OFC, VS, and DS. We also found that a significant number of OFC cells differentiated between stimulus attributes within the relevant rule category before and after rule acquisition, whereas a significant number of VS cells demonstrated this effect after rule acquisition. Finally, we also found that a significant number of cells in OFC demonstrated an interaction between category-relevant and category-irrelevant stimulus attributes before rule acquisition, but not after rule acquisition.
To compare directly the proportions of cells demonstrating rule-related modulation during the presentation of choice options across groups, we implemented a mixed-model binary logistic regression procedure using the between-subjects factor brain region (OFC, VS, DS) and the within-subjects factors trial period (preacquisition, postacquisition) and task variable (rule category, category-relevant stimulus attributes, category-irrelevant stimulus attributes, and the interaction between category-relevant and category-irrelevant attributes). In this analysis, “within-subjects” and “between-subjects” refer to neurons. In this procedure, an omnibus Wald χ2 test was applied to determine the significance of group effects. We found a significant main effect of trial period (χ2 = 4.0363, p = 0.0445), but no main effect of brain region (χ2 = 5.4495, p = 0.0656) or task variable (χ2 = 7.4209, p = 0.0596). We also found a significant interaction between task variable and trial period (χ2 = 23.3590, p < 0.0001), but no other interactions (p > 0.05 for all comparisons). Because we did not find a significant main effect of brain region, or any interactions with brain region, we concluded that the proportion of cells demonstrating rule-related modulation during the presentation of choice options before and after rule acquisition did not differ across OFC, VS, and DS. Therefore, we did not conduct subsequent pairwise comparisons to examine differences across brain regions.
To further examine the effect of trial period, we conducted pairwise comparisons using Fisher's LSD tests to compare proportions of significant cells across the preacquisition and postacquisition trial periods. We found that a significantly greater proportion of cells demonstrated rule category modulation after rule acquisition compared with before rule acquisition in all three brain regions (OFC: p = 0.0137, VS: p = 0.0003, DS: p = 0.0008). In VS, we found that a significantly greater proportion of cells differentiated between stimulus attributes in the relevant dimension after rule acquisition compared with before rule acquisition (p = 0.0280) and, in OFC, we found that a significantly greater proportion of cells demonstrated an interaction between relevant and irrelevant stimulus attributes before rule acquisition compared with after rule acquisition (p = 0.0202).
These results indicate that all three regions contribute to rule category encoding after rule acquisition, which may reflect a role in directing attention toward, or maintaining information about, relevant rule categories during the evaluation of choice options. These results also suggest that OFC neurons carry information about both stimulus dimensions before, but not after, learning the relevant rule. Finally, our results further suggest that OFC and VS neurons carry information about specific stimulus attributes within relevant rule categories.
Rule-related neural modulation is not driven by differences in behavioral performance
Because Subject B, but not Subject C, demonstrated differences in performance across rules and rule categories, we ran several analyses to determine whether performance or reward expectation drove differences in firing rate activity across rules and rule categories. First, we ran a correlation between average firing rate activity and average accuracy on blocks of each rule type for each subject separately and during each of the three analysis epochs (the preparatory, presentation, and choice epochs). We found no correlation between firing rate and performance for Subject B during the preparatory period (p = 0.2405), the presentation period (p = 0.1583), or the choice period (p = 0.2067). Similarly, we found no correlation between firing rate and performance for Subject C during the preparatory period (p = 0.3925), the presentation period (p = 0.6764), or the choice period (p = 0.2888). These results suggest that, across the populations of neurons, differences in performance did not have a measurable overall effect on firing rate activity.
To determine whether differences in behavioral performance might affect the prevalence of rule or rule category encoding in individual neurons more specifically, we tested whether Subject B and Subject C differed in regard to the number of cells demonstrating an effect of rule or rule category during the preparatory period, the choice period, or the presentation period. Because Subject B demonstrated differences in performance across rules and rule categories, these differences could drive differences in firing rate activity across rules and rule categories. However, because Subject C did not demonstrate differences in performance across rules or rule categories, it is unlikely that performance drove firing rates in Subject C. Therefore, if Subject B and Subject C demonstrate a similar proportion of cells modulated by rule and rule category, then this would suggest that performance differences did not affect neurons' firing rate activity differentially. To determine whether behavior affected firing rate activity, we compared proportions of cells modulated by rule and rule category across monkeys from our previous analyses examining rule-related activity after rule acquisition. We limited this analysis to postacquisition trials because, overall, most neurons encoded rules and rule categories after rule acquisition.
Overall, we found no evidence that the proportions of cells modulated by rule or rule category were different between Subject B and Subject C in any of the three regions or in any of the three analysis epochs, suggesting that rule modulation in single neurons was not driven by differences in behavioral performance. Specifically, in our analysis of rule encoding during the preparatory and choice periods, we found that the proportion of OFC cells demonstrating rule modulation did not differ significantly between subjects during the preparatory period (8/46 and 4/65, χ2 = 3.528, p = 0.0603, χ2 test) or the choice period (6/46 and 10/65, χ2 = 0.120, p = 0.7294). The same was true of VS cells during the preparatory period (6/37 and 4/49, χ2 = 1.330, p = 0.2487) and the choice period (10/37 and 7/49, χ2 = 2.158, p = 0.1418) and of DS cells during the preparatory period (17/66 and 18/126, χ2 = 3.824, p = 0.0505) and the choice period (9/66 and 12/126, χ2 = 0.752, p = 0.3858).
Similarly, the proportion of OFC cells demonstrating rule category modulation did not differ significantly between subjects during the preparatory period (5/46 and 11/65, χ2 = 0.800, p = 0.3711) or the choice period (4/46 and 10/65, χ2 = 1.093, p = 0.2957). The same was true of VS cells during the preparatory period (7/37 and 4/49, χ2 = 2.186, p = 0.1392) and the choice period (6/37 and 4/49, χ2 = 1.330, p = 0.2487) and of DS cells during the preparatory period (13/66 and 13/126, χ2 = 3.255, p = 0.0712) and the choice period (6/66 and 9/126, χ2 = 0.228, p = 0.6328).
Finally, in our analysis of the presentation period, the proportion of cells demonstrating rule category modulation also did not differ significantly between subjects in OFC (5/46 and 12/65, χ2 = 1.197, p = 0.2739), in VS (12/37 and 11/49, χ2 = 1.072, p = 0.3004), or in DS (12/66 and 19/126, χ2 = 0.308, p = 0.5790). Together, these results suggest that rule-related modulation in OFC, VS, and DS neurons was not driven by differences in performance across rules or rule categories during any of the three analysis epochs. Therefore, the most parsimonious explanation for our results is that firing rates were driven by rules and rule categories in OFC, VS, and DS independently of behavioral performance and thus independently of reward expectation.
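The between-subject comparisons above are χ2 tests on 2 × 2 tables (subject by significant or nonsignificant cell). As a minimal sketch, the Pearson statistic without continuity correction appears to reproduce the value reported for the OFC preparatory-period rule comparison (8/46 vs 4/65). The use of the uncorrected statistic is inferred from the reported numbers rather than stated in the text, and the function name is hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_square_2x2(sig_a: int, n_a: int, sig_b: int, n_b: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) comparing the
    proportion of significant cells in two groups. Returns (chi2, p).
    Assumes every expected cell count is greater than zero."""
    observed = [[sig_a, n_a - sig_a], [sig_b, n_b - sig_b]]
    row_totals = [n_a, n_b]
    total = n_a + n_b
    col_totals = [sig_a + sig_b, total - (sig_a + sig_b)]
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (observed[r][c] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2.0))
    return chi2, p


# OFC, preparatory period, rule modulation: Subject B 8/46, Subject C 4/65.
chi2, p = chi_square_2x2(8, 46, 4, 65)
```

With these inputs the statistic is approximately 3.528 and the p value approximately 0.060, in line with the reported values. Several cells in these tables have small expected counts, so an exact test would be a reasonable alternative, although that is a design choice outside what the text describes.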
Discussion
We examined how OFC, VS, and DS contribute to rule-based decision making using a primate version of the WCST. Neurons in all three regions showed reliable changes in firing rate for rule identity and rule category. These changes were observed throughout the trial, including the preparatory period, the presentation period, and the choice period. During the presentation period, we found that rule influenced neural responses to offer identities, suggesting that rule encoding guides selection of upcoming targets. These findings are consistent with the idea that rule maintenance involves, in part, changes in the routing properties of neurons in the reward system.
Rule maintenance is a basic form of working memory: stored information that serves to modulate responses. Classic theories of working memory hold that it is stored in specialized and dedicated circuits, especially DLPFC (Fuster and Alexander, 1971; Funahashi, 2001). However, more recent research suggests an alternative possibility: that working memory may be stored in the form of changes in the response properties of sensory associative neurons (Pasternak and Greenlee, 2005; Postle, 2006). Such changes can then serve to modulate sensory responses, thus implementing a decision process (Machens et al., 2005; Mirabella et al., 2007; Hayden and Gallant, 2013). Our research suggests a natural extension of these ideas: that rule maintenance involves systematic brainwide changes in neural responsiveness in regions that are relevant for task performance.
OFC and VS are sometimes thought to be pure and selective reward areas; well-known theories hold that these regions are specialized for representing possible and realized rewards, for developing and maintaining stimulus–reward associations, and/or for tracking changing rewards in dynamic environments (Apicella et al., 1991; Schultz et al., 1992; Kringelbach, 2005; Delgado, 2007; Wallis, 2007; Padoa-Schioppa, 2011; Diekhof et al., 2012; Stalnaker et al., 2015). Evidence has also implicated the OFC in emotional switching (Dias et al., 1996). However, a great deal of evidence also indicates an executive role for these areas (Diekhof et al., 2011; Bryden and Roesch, 2015; Stalnaker et al., 2015; Bissonette and Roesch, 2016; Sleezer and Hayden, 2016; Strait et al., 2016). Indeed, a recent comprehensive review challenges the “pure reward” viewpoint of VS and argues instead for a broader executive influence over actions (Floresco, 2015). The present results are consistent with these earlier findings and expand our understanding of rule maintenance in OFC, VS, and DS. These results are particularly clear evidence for the involvement of OFC, VS, and DS in executive functions because rule is experimentally dissociable from reward information, unlike error monitoring, conflict monitoring, self-control, and behavioral adjustment, which are highly correlated with, and thus difficult to disentangle from, reward variables (O'Doherty, 2014; Heilbronner and Hayden, 2016).
Few studies have examined the role of core reward regions in rule representation. Among the studies that have, the results have been somewhat conflicting. Most notably, whereas one study found that single neurons in OFC encode rules during a delay period between rule instruction and choice (suggesting a role in rule maintenance; Wallis et al., 2001), a later study found that lesions of the OFC in monkeys impair only the learning of rule–value associations, not rule maintenance (Buckley et al., 2009). Our results confirm and extend the original Wallis discovery. First, we extend to two new brain areas and to rule categories. Second, we show that rule representation extends to rules that are not explicitly cued, but are instead learned through trial and error. Third, we demonstrate a role for rule encoding in target selection. Finally, we provide a description of the dynamics of rule and category representation across learning.
Perhaps the results most relevant to ours are those of Schoenbaum and colleagues (Schoenbaum et al., 1999, 2003; Wilson et al., 2014; Stalnaker et al., 2015), who concluded that OFC encodes a suite of task-relevant variables that together instantiate a cognitive map of task space. Like the rule encoding that we observed here, this representation could serve to guide behavior appropriately (for a similar argument, see Blanchard et al., 2015a). Our results suggest that these arguments may also apply to VS and DS. One executive role of OFC is linking cognition to action. Several previous studies indicated that spatial information can be observed in firing rates of OFC neurons (Roesch et al., 2006; Tsujimoto et al., 2009, 2011; Abe and Lee, 2011; Luk and Wallis, 2013; Bryden and Roesch, 2015; Strait et al., 2016) and also in VS (Strait et al., 2016). By showing a link between stimulus presentation and selection, the present results suggest another, more basic role of OFC, VS, and DS in selection, one that includes identifying choice options.
Several studies suggest that OFC and VS contribute to simple types of flexible decision making and not more complex types because they maintain representations of discrete stimuli and not rules (Cools et al., 2004; Dang et al., 2012; and for review, see Robbins, 2007). In the present study, we observed encoding of both stimulus attributes (color and shape) in OFC neurons before rule acquisition and a switch to encoding rule category after rule acquisition. Therefore, OFC transitions from representing both stimulus attributes early in learning, when both are relevant, to representing the task context (i.e., rule category) later, when only one stimulus dimension is relevant. This finding suggests that OFC represents the most important task variables preferentially as their relevancy changes with learning. This result is consistent with previous work suggesting that OFC biases attention toward the most relevant task variables at hand (Diekhof et al., 2011).
Several lines of evidence link VS to rule-based switching. For example, VS inactivation impairs behavioral performance during rule switching (Floresco et al., 2006), and VS cholinergic interneurons have been shown to play an important role in rule switching (Aoki et al., 2015). A recent comprehensive review of VS function also proposes that VS serves to refine action selection dynamically by incorporating multiple sources of information including, but not limited to, reward (Floresco, 2015). This executive view of VS is supported by our own previous findings, which demonstrate a direct role for VS in economic decisions and in regulation of rule switching (Strait et al., 2015; Sleezer and Hayden, 2016).
Previous studies indicated that DS represents rule identity and rule order (Badre et al., 2010; Reverberi et al., 2012) and contributes to conceptually similar processes such as sequence learning (Yin, 2010). Recent research also indicates that DS is involved in resolving conflict between competing rule information during rule switching (Bissonette and Roesch, 2015). Our finding that DS represents rules during choices and periods of delay between trials corroborates these findings.
The present results complement our earlier research on the neural basis of executive control. We demonstrated previously an important role for the striatum and some of its ostensibly reward-sensitive afferents in rule switching (Hayden et al., 2010, 2011a, 2011b; Blanchard and Hayden, 2014; Sleezer and Hayden, 2016) and in persistence (Blanchard et al., 2015b). Overall, these results highlight the widespread and distributed nature of rule-based switching and argue against the view that this function is the exclusive domain of a small and highly specialized piece of brain tissue. That said, this does not imply that OFC, VS, and DS make identical contributions to cognition. Indeed, there is compelling evidence for strong functional differences between them (Schultz, 2000; Elliott et al., 2003; O'Doherty et al., 2004; Atallah et al., 2007; Frank and Claus, 2006; Block et al., 2007; Floresco et al., 2008). Moreover, we have shown previously that VS and DS contribute to different stages of rule switching; whereas VS neurons signal rule switches during early periods of learning, DS neurons signal switches later, once rules are known (Sleezer and Hayden, 2016). Our present results are consistent with these differential contributions. Notably, our results demonstrate that rule encoding is stronger in VS before rule acquisition compared with OFC and DS and that OFC neurons carry information about all task-relevant stimulus attributes (i.e., color and shape) when viewing choice options before, but not after, rule acquisition. Therefore, our results suggest that OFC and VS may work in conjunction to identify appropriate rules during learning, with OFC facilitating the identification of task-relevant variables and VS potentially using this information to identify more specifically which rule is appropriate. Once rules are learned, all three regions maintain rule-based behavior by representing specific rules and rule categories.
Together, our results indicate that OFC, VS, and DS all contribute to rule representation, but do so in different ways. More speculatively, the breadth of brain regions in which neural representations of rule maintenance and switching can be observed suggests a change in our understanding of what rule maintenance is. It suggests that rule maintenance is an active storage of a set of reweightings of neuronal responsiveness, which in turn modify the decision maker's responses to task stimuli. Therefore, any region that participates in determining behavior, even in a small way, may show changes in neural activity depending on the active rule.
Footnotes
This work was supported by the National Institutes of Health (Grant R01 DA038106 to B.Y.H. and Training Fellowship T32-EY007125 to B.J.S.). We thank Marc Mancarella for general laboratory assistance and Giuliana Loconte for assistance with data collection.
The authors declare no competing financial interests.
Correspondence should be addressed to Brianna J. Sleezer, Department of Brain and Cognitive Sciences, University of Rochester, 500 Joseph C. Wilson Blvd, Rochester, NY 14627. Brianna_Sleezer@urmc.rochester.edu