## Abstract

A series of studies in which monkeys chose between two juices offered in variable amounts identified in the orbitofrontal cortex (OFC) different groups of neurons encoding the value of individual options (offer value), the binary choice outcome (chosen juice), and the chosen value. These variables capture both the input and the output of the choice process, suggesting that the cell groups identified in OFC constitute the building blocks of a decision circuit. Several lines of evidence support this hypothesis. However, in previous experiments offers were presented simultaneously, raising the question of whether current notions generalize to when goods are presented or are examined in sequence. Recently, Ballesta and Padoa-Schioppa (2019) examined OFC activity under sequential offers. An analysis of neuronal responses across time windows revealed that a small number of cell groups encoded specific sequences of variables. These sequences appeared analogous to the variables identified under simultaneous offers, but the correspondence remained tentative. Thus, in the present study, we examined the relation between cell groups found under sequential versus simultaneous offers. We recorded from the OFC while monkeys chose between different juices. Trials with simultaneous and sequential offers were randomly interleaved in each session. We classified cells in each choice modality, and we examined the relation between the two classifications. We found a strong correspondence; in other words, the cell groups measured under simultaneous offers and under sequential offers were one and the same. This result indicates that economic choices under simultaneous or sequential offers rely on the same neural circuit.

**SIGNIFICANCE STATEMENT** Research in the past 20 years has shed light on the neuronal underpinnings of economic choices. A large number of results indicate that decisions between goods are formed in a neural circuit within the orbitofrontal cortex. In most previous studies, subjects chose between two goods offered simultaneously. Yet, in daily situations, goods available for choice are often presented or examined in sequence. Here we recorded neuronal activity in the primate orbitofrontal cortex alternating trials under simultaneous and under sequential offers. Our analyses demonstrate that the same neural circuit supports choices in the two modalities. Hence, current notions on the neuronal mechanisms underlying economic decisions generalize to choices under sequential offers.

## Introduction

Neurophysiology experiments where monkeys chose between different juice types identified in the orbitofrontal cortex (OFC) different groups of cells encoding individual offer values, the binary choice outcome (chosen juice) and the chosen value (Padoa-Schioppa and Assad, 2006). Similar results were obtained in monkeys choosing between juice bundles (Pastor-Bernier et al., 2019), in mice (Kuwabara et al., 2020), and in humans using fMRI (Hare et al., 2008; Howard et al., 2015). The variables encoded in OFC capture both the input and the output of the choice process, and the corresponding cell groups are computationally sufficient to generate binary decisions (Rustichini and Padoa-Schioppa, 2015; Song et al., 2017; Zhang et al., 2018). In monkeys, mild electrical stimulation of this area biases choices in predictable ways (Ballesta et al., 2020). Furthermore, lesions in humans (Camille et al., 2011; Yu et al., 2018), high current stimulation in monkeys (Ballesta et al., 2020), or optogenetic inactivation in mice (Gore et al., 2019; Kuwabara et al., 2020) dramatically increases choice variability. The circuit dynamics is consistent with a decision process (Rich and Wallis, 2016), and trial-by-trial fluctuation in the activity of each cell group correlates with choice variability (Padoa-Schioppa, 2013). Together, these results suggest that the cell groups identified in OFC constitute the building blocks of a neural circuit in which economic decisions are formed. One caveat is that current notions on this circuit emerge mostly from studies in which two options were presented simultaneously. Yet, in most daily situations, options available for choice appear or are examined in sequence. Moreover, some scholars have argued that choices under sequential or simultaneous offers rely on qualitatively different mechanisms (Kacelnik et al., 2011; Hunt et al., 2013; Hayden and Moreno-Bote, 2018).

To shed light on the mechanisms underlying choices under sequential offers, we recently recorded from the OFC of monkeys choosing between different juices offered sequentially (Ballesta and Padoa-Schioppa, 2019). Consistent with previous observations (McGinty et al., 2016; Hunt et al., 2018), neuronal responses in any time window depended on the presentation order (i.e., on what juice the animal was offered at that time). However, an analysis of neuronal responses across time windows revealed that different groups of cells encoded different patterns of variables, referred to as “sequences.” Across a large population of neurons, we identified 8 such sequences. We also noted that these sequences presented analogies with the cell groups previously identified under simultaneous offers. For example, some sequences represented the value of specific juices, while other sequences presented binary responses. These observations suggested that the two sets of cell groups recorded under sequential and under simultaneous offers might indeed be one and the same. If this hypothesis was confirmed, notions on the decision mechanisms acquired under simultaneous offers would apply to a much broader domain of choices than previously recognized.

To test this hypothesis, we recorded the activity of neurons in OFC while monkeys chose between different juices. In each session, choices under simultaneous offers and choices under sequential offers were pseudo-randomly interleaved. In the analysis, we first separated trials with the two choice tasks (modalities) and classified each cell in each choice task. We then considered the whole population and compared the results of the classification obtained for the two choice tasks. We envisioned three possible scenarios: (1) the two choice tasks could engage different neuronal assemblies (different populations); (2) the two tasks might engage the same neuronal population, but individual neurons might have different roles in the two tasks (independent classification); or (3) the same groups of neurons might support decisions in the two choice tasks (corresponding classifications). Statistical analyses provided strong evidence for the last hypothesis. Thus, our results indicate that choices under sequential offers and choices under simultaneous offers rely on the same decision circuit.

## Materials and Methods

All the experimental procedures adhered to the National Institutes of Health's *Guide for the care and use of laboratory animals* and were approved by the Institutional Animal Care and Use Committee at Washington University.

##### Animal subjects and choice tasks

Two adult male rhesus monkeys (*Macaca mulatta*; Monkey J, 10.0 kg, 8 years old; Monkey G, 9.1 kg, 9 years old) participated in this study. Before training and under general anesthesia, we implanted on each animal a head restraining device and an oval chamber (axes 50 × 30 mm). Chambers were centered on stereotaxic coordinates (A30, L0), with the longer axis parallel to coronal planes, allowing bilateral access to OFC with coronal electrode penetrations. Structural MRI scans (1 mm sections) obtained before and after surgery were used to locate OFC and guide neuronal recordings. During the experiments, monkeys sat in an electrically and acoustically insulated enclosure (Crist Instrument), with their head fixed and pink noise in the background. A computer monitor was placed in front of the animal at 57 cm distance. The gaze direction was monitored at 1 kHz using an infrared video camera (Eyelink, SR Research). The behavioral task was controlled using custom-written software (https://monkeylogic.nimh.nih.gov) (Hwang et al., 2019) based on MATLAB (version 2016a, The MathWorks).

In each session, the animal chose between two juices labeled A and B (A preferred) offered in variable amounts. Each session included trials with two choice modalities, referred to as Task 1 and Task 2 (Fig. 1*A*,*B*). The two tasks were nearly identical to those used in previous studies (Padoa-Schioppa and Assad, 2006; Ballesta and Padoa-Schioppa, 2019), and trials with the two tasks were pseudo-randomly interleaved. In both tasks, offers were represented by sets of colored squares displayed on the computer monitor. For each offer, the color indicated the juice type and the number of squares indicated the quantity. Each trial began with the animal fixating a large dot in the center of the monitor. After 0.5 s, the initial fixation point changed to a small dot or a small cross; the new fixation point cued the animal to the choice task used in that trial. In Task 1 (Fig. 1*A*), cue fixation (0.5 s) was followed by the simultaneous presentation of the two offers. After a randomly variable delay (1-1.5 s), the center fixation disappeared and two saccade targets appeared near the offers (go signal). The animal indicated its choice with an eye movement. It maintained peripheral fixation for 0.75 s, after which the chosen juice was delivered. In Task 2 (Fig. 1*B*), cue fixation (0.5 s) was followed by the presentation of one offer (0.5 s), an interoffer delay (0.5 s), presentation of the other offer (0.5 s), and a wait period (0.5 s). Two colored saccade targets then appeared on the two sides of the fixation point. After a randomly variable delay (0.5-1 s), the center fixation disappeared (go signal). The animal indicated its choice with a saccade, maintained peripheral fixation for 0.75 s, after which the chosen juice was delivered. Central and peripheral fixation were imposed within 4-6 and 5-7 degrees of visual angle, respectively.

For any given trial, *q*_{A} and *q*_{B} indicate the quantities of juices A and B offered to the animal, respectively. An “offer type” was defined by two quantities [*q*_{A}, *q*_{B}]. In any given session, we used the same juices and the same sets of offer types for the two tasks. For Task 1, the spatial configuration of the offers (left/right) varied randomly from trial to trial. For Task 2, trials in which juice A was offered first and trials in which juice B was offered first were referred to as “AB trials” and “BA trials,” respectively. The terms “offer1” and “offer2” indicated, respectively, the first and second offer, independently of the juice type and amount. In Task 2, the presentation order varied pseudo-randomly and was counterbalanced across trials for any offer type. The spatial location (left/right) of saccade targets varied randomly and independently of the presentation order. The juice volume corresponding to one square (quantum) was set equal for the two tasks and remained constant within each session. It varied across sessions (70-100 μl) for both monkeys. The association between the initial cue (small dot, small cross) and the choice modality (Task 1, Task 2) varied across sessions, in blocks.

In Task 2, AB trials and BA trials were analyzed separately (see below). A power analysis indicated that comparing neuronal responses across tasks would be most effective if the number of trials for Task 2 was √2 times that for Task 1. Thus, in most sessions, we set the number of trials for Task 2 equal to 1.5 times that for Task 1.

Before this study, Monkey J had participated in experiments using Task 2 and had no exposure to Task 1. For the current study, the animal was first trained with Task 1 alone and then with the two tasks randomly interleaved. Monkey G had participated in different experiments using simultaneous offers (Task 1) or sequential offers (Task 2). For the current study, the animal was trained to perform the two choice tasks randomly interleaved.

Across sessions, we used the following juices (colors): lemon Kool-Aid (bright yellow), grape juice (bright green), cherry juice (diluted to 3/4 with water or no dilution, red), peach juice (diluted to 3/4 with water, rose), fruit punch (diluted to 1/3 with water, magenta), apple juice (diluted to 1/2 with water, dark green), cranberry juice (diluted to 1/3 with water, pink), peppermint tea (bright blue), kiwi punch (dark blue), watermelon Kool-Aid (lime), and slightly salted water (0.65 g/l concentration, light gray).

##### Behavioral analysis

Choices in the two tasks were analyzed separately with probit regressions. For Task 1, we used the following model:

$$\textit{choice B} = \Phi(X), \qquad X = a_0 + a_1 \log(q_B/q_A)$$

where *choice B* = 1 if the animal chose juice B and 0 otherwise, Φ was the cumulative function of the standard normal distribution, and *q*_{A} and *q*_{B} were the quantities of juices A and B offered. From the fitted parameters, we derived measures for the relative value of the juices ρ_{Task 1} = *exp*(–*a*_{0}/*a*_{1}) and the sigmoid steepness η_{Task 1} = *a*_{1}.

For Task 2, we used the following probit model:

$$\textit{choice B} = \Phi(X), \qquad X = a_2 + a_3 \log(q_B/q_A) + a_4\,(\delta_{order,AB} - \delta_{order,BA})$$

where δ_{order,AB} = 1 for AB trials and 0 otherwise, and δ_{order,BA} = 1 – δ_{order,AB}. Thus, AB trials and BA trials were analyzed separately but assuming that the two sigmoids had the same steepness. From the fitted parameters, we derived measures for the relative value ρ_{Task 2} = *exp*(–*a*_{2}/*a*_{3}), the sigmoid steepness η_{Task 2} = *a*_{3}, and the order bias ε = 2ρ_{Task 2}*a*_{4}/*a*_{3}. The order bias was defined such that ε < 0 (ε > 0) indicated a bias in favor of offer1 (offer2). We also defined the relative values specific to AB trials and BA trials as ρ_{AB} = *exp*(–(*a*_{2}+*a*_{4})/*a*_{3}) and ρ_{BA} = *exp*(–(*a*_{2}–*a*_{4})/*a*_{3}). Of note, the order bias was defined such that ε ≈ ρ_{BA} – ρ_{AB}.

In some cases, one or both choice patterns presented complete or quasi-complete separation (i.e., the animal split choices for ≤1 offer types). In these cases, the fitted steepness (η) was high and unstable. We identified outlier sessions using an interquartile criterion. Defining IQR as the interquartile range, values below the first quartile minus 1.5 × IQR or above the third quartile plus 1.5 × IQR were identified as outliers and removed from the behavioral analysis (Fig. 1*D–F*). This criterion excluded 14 of 115 sessions for Monkey J and 51 of 191 sessions for Monkey G. Including all sessions in the analysis did not substantially change the results. Importantly, data from all sessions were included in the neuronal analyses.
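As an illustration of how the behavioral measures follow from the Task 2 probit parameters, the sketch below computes them for one hypothetical set of fitted values. It is not the analysis code used in the study. The function name `task2_measures` and the parameter values are assumptions made for this example, and the code assumes a probit index of the form X = a2 + a3·log(qB/qA) + a4·(δ_AB – δ_BA), which is the form implied by the expressions for ρ, η, and ε.

```python
import math

def task2_measures(a2, a3, a4):
    """Derive behavioral measures from Task 2 probit parameters.

    Assumes the probit index X = a2 + a3*log(qB/qA) + a4*(d_AB - d_BA),
    where d_AB = 1 on AB trials and d_BA = 1 - d_AB.
    """
    rho = math.exp(-a2 / a3)            # relative value (indifference ratio qB/qA)
    eta = a3                            # sigmoid steepness
    eps = 2.0 * rho * a4 / a3           # order bias
    rho_ab = math.exp(-(a2 + a4) / a3)  # relative value on AB trials
    rho_ba = math.exp(-(a2 - a4) / a3)  # relative value on BA trials
    return {"rho": rho, "eta": eta, "eps": eps,
            "rho_AB": rho_ab, "rho_BA": rho_ba}

# Hypothetical fitted parameters, chosen only for illustration.
m = task2_measures(a2=-2.0, a3=2.5, a4=0.1)
print(m)
print("eps vs rho_BA - rho_AB:", m["eps"], m["rho_BA"] - m["rho_AB"])
```

When a4/a3 is small, ε and ρ_{BA} – ρ_{AB} agree to first order, which is the approximation noted in the text.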

##### Neuronal recordings

Neural recordings focused on area 13m in the central orbital gyrus (Ongur and Price, 2000). We recorded from both hemispheres of Monkey J (left: AP 31:35, ML –8:–10; right: AP 31:35, ML 6:10) and both hemispheres of Monkey G (left: AP 31:36, ML –7:–12; right: AP 31:36, ML 4:9). Tungsten single electrodes (100 µm shank diameter; FHC) were advanced remotely using a custom-built motorized micro-drive (step size 2.5 µm). Typically, one motor advanced two electrodes placed 1 mm apart, and 1 or 2 such pairs of electrodes were advanced unilaterally or bilaterally in each session. Each electrode would usually record the activity of 1-2 cells (average 1.25 cells/electrode). Amplified signals (gain: 10,000) were filtered (high-pass cutoff: 300 Hz; low-pass cutoff: 6 kHz; Lynx 8, Neuralynx), digitized (frequency: 40 kHz), and saved to disk (Power 1401, Cambridge Electronic Design). Spike sorting was performed offline (Spike 2 version 6, Cambridge Electronic Design). Only cells that appeared well isolated and stable throughout the session were included in the analysis.

##### Neuronal classification within task modality

For each neuron, trials from Task 1 and Task 2 were first analyzed separately using the procedures developed in previous studies (Padoa-Schioppa and Assad, 2006; Ballesta and Padoa-Schioppa, 2019). For Task 1, we defined four time windows: post-offer (0.5 s after offer onset), late-delay (0.5-1 s after offer onset), pre-juice (0.5 s before juice onset), and post-juice (0.5 s after juice onset). A “trial type” was defined by two offered quantities and a choice. For Task 2, we defined three time windows: post-offer1 (0.5 s after offer1 onset), post-offer2 (0.5 s after offer2 onset), and post-juice (0.5 s after juice onset). A “trial type” was defined by two offered quantities, their order and a choice. For each task, each trial type, and each time window, we averaged spike counts across trials. A “neuronal response” was defined as the firing rate of one cell in one time window as a function of the trial type. Neuronal responses in each task were submitted to an ANOVA (factor: trial type). Neurons passing the *p* < 0.01 criterion in at least one time window in either task were identified as “task-related” and included in subsequent analyses.
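The definition of a neuronal response can be sketched in a few lines. The example below is illustrative only: the function name `neuronal_response`, the data layout (a list of trial-type/spike-count pairs), and the spike counts are assumptions, not part of the original analysis.

```python
from collections import defaultdict

def neuronal_response(trials, window_s=0.5):
    """Average firing rate (spikes/s) per trial type in one time window.

    `trials` is a list of (trial_type, spike_count) pairs, where trial_type
    is any hashable label, e.g. (qA, qB, choice) for Task 1.
    """
    counts = defaultdict(list)
    for trial_type, spike_count in trials:
        counts[trial_type].append(spike_count)
    # Mean spike count per trial type, converted to a rate.
    return {tt: sum(c) / len(c) / window_s for tt, c in counts.items()}

# Hypothetical spike counts in a 0.5 s window; trial type = (qA, qB, choice).
trials = [((1, 3, "B"), 4), ((1, 3, "B"), 6),
          ((1, 0, "A"), 2), ((1, 0, "A"), 3)]
response = neuronal_response(trials)
print(response)
```

For Task 2, the same function applies if the trial-type label also includes the presentation order (AB or BA).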

Following previous work (Padoa-Schioppa and Assad, 2006; Padoa-Schioppa, 2013), neurons in Task 1 were classified in one of four groups: offer value A, offer value B, chosen juice, or chosen value. Each variable could be encoded with positive or negative sign, leading to a total of 8 cell groups. For the classification, we proceeded as follows. Each neuronal response was regressed against each of the four variables defined in Table 1. If the regression slope *b*_{1} differed significantly from zero (*p* < 0.05), the variable was said to “explain” the response. In this case, we set the signed *R*^{2} as *sR*^{2} = sign(*b*_{1}) *R*^{2}; if the variable did not explain the response, we set *sR*^{2} = 0. After repeating the operation for each time window, we computed for each cell and for each variable the *sum*(*sR*^{2}) across time windows. Neurons explained by at least one variable in one time window were said to be tuned; other neurons were labeled “untuned.” Tuned cells were assigned to the variable and sign providing the maximum |*sum*(*sR*^{2})|, where |·| indicates the absolute value. Indicating with “+” and “–” the sign of the encoding, each neuron was thus classified in 1 of 9 groups: offer value A+, offer value A–, offer value B+, offer value B–, chosen juice A, chosen juice B, chosen value+, chosen value–, and untuned.
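The assignment rule based on the summed signed *R*^{2} can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `classify_cell`, the input layout, and the regression results are hypothetical, and the regressions themselves (slope, *R*^{2}, and *p* value per variable and time window) are assumed to have been computed beforehand.

```python
def classify_cell(fits, alpha=0.05):
    """Assign a cell to the variable and sign with the largest |sum(sR^2)|.

    `fits` maps a variable name to a list of (slope, r2, p) tuples, one per
    time window. Returns (variable, sign), or ("untuned", None) when no
    variable explains the response in any window.
    """
    sums = {}
    tuned = False
    for variable, windows in fits.items():
        total = 0.0
        for slope, r2, p in windows:
            if p < alpha:  # the variable "explains" the response
                tuned = True
                total += r2 if slope > 0 else -r2  # signed R^2
        sums[variable] = total
    if not tuned:
        return "untuned", None
    best = max(sums, key=lambda v: abs(sums[v]))
    return best, "+" if sums[best] >= 0 else "-"

# Hypothetical regression results for one cell in two time windows.
fits = {
    "offer value A": [(0.1, 0.05, 0.30), (0.2, 0.10, 0.04)],
    "offer value B": [(-1.5, 0.60, 0.001), (-1.2, 0.45, 0.002)],
    "chosen juice":  [(0.3, 0.08, 0.20), (0.1, 0.02, 0.70)],
    "chosen value":  [(-0.9, 0.30, 0.01), (0.4, 0.12, 0.03)],
}
print(classify_cell(fits))  # ('offer value B', '-')
```

Summing the signed values before taking the absolute value means that a cell whose encoding flips sign across windows receives a lower score than one with a consistent sign.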

Neuronal classification in Task 2 followed the procedures described by Ballesta and Padoa-Schioppa (2019). Under sequential offers, neuronal responses in OFC were found to encode different variables defined in relation to the presentation order (AB or BA). Specifically, the vast majority of responses were explained by 1 of 11 variables defined in Table 1. These included one binary variable capturing the order (*AB* | *BA*), six variables representing individual offer values (offer value *A* | *AB*, offer value *A* | *BA*, offer value *B* | *AB*, offer value *B* | *BA*, offer1 value, and offer2 value), three variables capturing variants of the chosen value (chosen value, chosen value A, chosen value B), and a binary variable representing the binary choice outcome (chosen juice). Each of these variables could be encoded with a positive or negative sign. Most neurons appeared to encode different variables in different time windows. In principle, considering 11 variables, 2 signs of the encoding and 3 time windows, neurons might present a very large number of variable patterns across time windows. Remarkably, however, the vast majority of OFC neurons presented 1 of 8 patterns. These patterns are referred to as sequences and defined in Table 2. Thus, we classified each cell as encoding 1 of these 8 sequences. For each cell and each time window, we regressed the neuronal response against each of the variables predicted by each sequence. If the regression slope *b*_{1} differed significantly from zero (*p* < 0.05), the variable was said to explain the response and we set the signed *R*^{2} as *sR*^{2} = sign(*b*_{1}) *R*^{2}; if the variable did not explain the response, we set *sR*^{2} = 0. After repeating the operation for each time window, we computed for each cell the *sum*(*sR*^{2}) across time windows for each of the 8 sequences. Neurons such that *sum*(*sR*^{2}) ≠ 0 for at least one sequence were said to be tuned; other neurons were untuned. Tuned cells were assigned to the sequence that provided the maximum |*sum*(*sR*^{2})|. As a result, each neuron was classified in 1 of 9 groups: seq #1, seq #2, seq #3, seq #4, seq #5, seq #6, seq #7, seq #8, and untuned.

##### Comparing classification across choice tasks

We aimed to ascertain the relation between the classifications obtained for Task 1 and Task 2. To do so, we used statistical analyses for categorical data (Agresti, 2019). First, we constructed a 9 × 9 contingency table in which rows and columns represented, respectively, the cell classes defined in Task 1 and Task 2, and each entry indicated the number of neurons with the corresponding classifications. Second, to estimate whether the cell count obtained for any particular pair of classes departed from chance level, we computed a table of odds ratios (*OR*s). For each location (*i*, *j*) in the contingency table, *X*_{i,j} indicated the number of cells classified as class *i* in Task 1 and as class *j* in Task 2. We defined the following:

$$a = X_{i,j}, \qquad b = \sum_{k \neq j} X_{i,k}, \qquad c = \sum_{k \neq i} X_{k,j}, \qquad d = \sum_{k \neq i,\ l \neq j} X_{k,l}$$

The corresponding *OR* was defined as follows:

$$OR_{i,j} = \frac{a/b}{c/d} = \frac{a\,d}{b\,c}$$

The *OR* was calculated for each entry of the contingency table. We thus obtained a 9 × 9 table. For each entry (*i*, *j*), *OR*_{i,j} = 1 was the chance level. Conversely, *OR*_{i,j} > 1 (*OR*_{i,j} < 1) indicated that the number of neurons classified as (*i*, *j*) was higher (lower) than expected by chance based on the number of cells in class *i* and the number of cells in class *j*. To assess whether departures from chance level were statistically significant, we used the two-tailed Fisher's exact test, separately for each entry.
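The entry-by-entry analysis can be sketched with the standard library alone. The example below is illustrative and rests on stated assumptions: each entry (*i*, *j*) is collapsed to a 2 × 2 table (the entry itself, the rest of row *i*, the rest of column *j*, and everything else), and the two-tailed Fisher *p* value sums the probabilities of all tables that are no more likely than the observed one. The function names and the toy 3 × 3 table are hypothetical.

```python
from math import comb, inf

def collapse(table, i, j):
    """Collapse a contingency table to 2x2 counts (a, b, c, d) for entry (i, j)."""
    total = sum(sum(row) for row in table)
    row_i = sum(table[i])
    col_j = sum(row[j] for row in table)
    a = table[i][j]
    b = row_i - a                   # class i in Task 1, another class in Task 2
    c = col_j - a                   # another class in Task 1, class j in Task 2
    d = total - row_i - col_j + a   # neither class i nor class j
    return a, b, c, d

def odds_ratio(a, b, c, d):
    if b * c == 0:                  # undefined or infinite when a margin is empty
        return inf if a * d > 0 else float("nan")
    return (a * d) / (b * c)

def fisher_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test on a 2x2 table with fixed margins."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(k):                    # hypergeometric probability of k in cell (1,1)
        return comb(r1, k) * comb(r2, c1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Toy 3x3 table (rows: Task 1 classes, columns: Task 2 classes).
table = [[10, 2, 1],
         [3, 12, 2],
         [1, 2, 9]]
a, b, c, d = collapse(table, 0, 0)
print(odds_ratio(a, b, c, d), fisher_two_tailed(a, b, c, d))
```

A table concentrated on the diagonal, as in this toy example, yields *OR* > 1 for diagonal entries and *OR* < 1 for most off-diagonal entries.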

To compare the across-tasks table to some benchmark, we created two within-task tables. For each choice task and each trial type, we randomly divided trials into two sets (1 and 2). Pooling trial types, we obtained two complete sets of trials (Set 1 and Set 2). This procedure ensured that each set had the same number of trial types. For Task 1 data, we repeated the cell classification procedure described above separately for each trial set. We thus generated the within-task contingency table and the table of *OR*s comparing the results obtained for Sets 1 and 2. We repeated these operations for Task 2 data. To assess whether the two within-task tables of *OR*s and the across-tasks table of *OR*s differed significantly from each other, we used the Breslow-Day test (Breslow and Day, 1980). In essence, this test examines the homogeneity of *OR*s, with the null hypothesis that the *OR* is the same in all strata. In our case, the null hypothesis was that, for each location in the *OR* table, the three measures (two within-task and one across-tasks) were statistically identical. The Breslow-Day test is ultimately a χ^{2} test. Its statistic has an asymptotic χ^{2} distribution with *k*–1 degrees of freedom, where *k* is the number of strata. Here the test was conducted entry by entry with *k* = 3 (df = 2), and *p* < 0.01 identified statistical significance.
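For readers unfamiliar with the homogeneity test, the sketch below computes a Breslow-Day statistic for three 2 × 2 strata. It is a textbook-style illustration under explicit assumptions, not the implementation used in the study: the common odds ratio is taken to be the Mantel-Haenszel estimate, no Tarone correction is applied, strata with empty cells are not handled, and the *p* value uses the closed form exp(–x/2), which is valid only for df = 2. Function names and counts are hypothetical.

```python
import math

def breslow_day(strata):
    """Breslow-Day statistic for homogeneity of odds ratios.

    `strata` is a list of 2x2 counts (a, b, c, d). Returns (statistic, df).
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    psi = num / den                        # Mantel-Haenszel common odds ratio
    stat = 0.0
    for a, b, c, d in strata:
        n1, n2, m1 = a + b, c + d, a + c   # row totals and first-column total
        # Expected count A in cell (1,1) under OR = psi solves
        # A*(n2 - m1 + A) = psi*(m1 - A)*(n1 - A).
        qa = 1.0 - psi
        qb = n2 - m1 + psi * (m1 + n1)
        qc = -psi * m1 * n1
        if abs(qa) < 1e-12:                # psi == 1: the equation is linear
            roots = [-qc / qb]
        else:
            disc = math.sqrt(qb * qb - 4 * qa * qc)
            roots = [(-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)]
        lo, hi = max(0, m1 - n2), min(n1, m1)
        A = next(r for r in roots if lo < r < hi)   # the admissible root
        var = 1.0 / (1 / A + 1 / (n1 - A) + 1 / (m1 - A) + 1 / (n2 - m1 + A))
        stat += (a - A) ** 2 / var
    return stat, len(strata) - 1

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)

# Hypothetical strata with clearly different odds ratios (16, 1/16, and 1).
stat, df = breslow_day([(20, 5, 5, 20), (5, 20, 20, 5), (10, 10, 10, 10)])
print(stat, df, chi2_sf_df2(stat))
```

With three strata, as in the comparison of two within-task tables and one across-tasks table, the statistic is referred to a χ^{2} distribution with 2 degrees of freedom.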

Following the results presented in this study, we proceeded with a comprehensive (“final”) classification based on the activity recorded in both tasks. For each task-related cell, we calculated the *sum*(*sR*^{2}) for the eight variables in Task 1 (*sum*(*sR*^{2})_{Task 1}) and eight sequences in Task 2 (*sum*(*sR*^{2})_{Task 2}) as described above. We then added the corresponding *sum*(*sR*^{2})_{Task 1} and *sum*(*sR*^{2})_{Task 2} to obtain the *sum*(*sR*^{2})_{final}. Neurons such that *sum*(*sR*^{2})_{final} ≠ 0 for at least one class were said to be tuned; other neurons were untuned. Tuned cells were assigned to the cell class that provided the maximum |*sum*(*sR*^{2})_{final}|.

##### Selective activity range (*SAR*)

In each task, neurons respond to the encoded variables in multiple time windows. The strength of the encoding, referred to as selectivity, varies across windows and from cell to cell. Thus, for each cell and for each task, one might identify the time window with maximum selectivity. We examined whether there was a systematic relationship between the maximum selectivity window (*MSW*) measured for any given cell in the two tasks. To do so, we first defined the activity range (*AR*). For each cell and each time window, we performed the linear regression as follows:

$$fr = b_0 + b_1\, EV$$

where *fr* was the firing rate, *EV* was the encoded variable, and *b*_{0} and *b*_{1} were the fitted parameters. Indicating the minimum and maximum of *EV*, respectively, as *EV*_{min} and *EV*_{max}, we computed Δ*EV* = *EV*_{max} – *EV*_{min}. The activity range was defined as *AR* = |*b*_{1} Δ*EV*|, where |·| indicates the absolute value. We also defined the *SAR* as follows:

$$SAR = AR \cdot P \cdot N$$

where *P* = 1 if the response passed the ANOVA criterion and 0 otherwise, and *N* = 1 if the slope of the encoded variable differed significantly from zero and 0 otherwise.
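The activity range and its selective variant can be illustrated with a short sketch. Several points here are assumptions rather than statements from the study: the *SAR* is taken to be the *AR* gated by the two binary indicators (*AR* × *P* × *N*), the *MSW* is taken to be the window with the largest *SAR*, and all function names and data values are hypothetical.

```python
def activity_range(ev, fr):
    """Fit fr = b0 + b1*EV by least squares; return AR = |b1 * (EV_max - EV_min)|."""
    n = len(ev)
    mean_ev = sum(ev) / n
    mean_fr = sum(fr) / n
    sxx = sum((x - mean_ev) ** 2 for x in ev)
    sxy = sum((x - mean_ev) * (y - mean_fr) for x, y in zip(ev, fr))
    b1 = sxy / sxx                       # regression slope
    return abs(b1 * (max(ev) - min(ev)))

def selective_activity_range(ar, passed_anova, slope_significant):
    # Assumption: SAR = AR * P * N, with P and N the binary indicators.
    return ar * int(passed_anova) * int(slope_significant)

def max_selectivity_window(sar_by_window):
    # Assumption: the MSW is the window with the largest SAR.
    return max(sar_by_window, key=sar_by_window.get)

# Hypothetical encoded variable and firing rates (spikes/s) in one window.
ev = [0, 1, 2, 3]
fr = [2.0, 4.0, 6.0, 8.0]
ar = activity_range(ev, fr)
sar = {"post-offer": selective_activity_range(ar, True, True),
       "late-delay": selective_activity_range(3.0, True, False)}
print(ar, sar, max_selectivity_window(sar))
```

Gating by the two indicators sets the selectivity to zero in windows where the response is not significantly modulated, so those windows cannot be selected as the *MSW*.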

## Results

Two monkeys chose between different juices offered in variable amounts. Offers were represented by sets of colored squares displayed on a computer monitor, and animals indicated their choice with an eye movement. In each session, trials with two choice modalities were randomly interleaved. In one modality (Task 1), two offers were presented simultaneously (Fig. 1*A*); in the other modality (Task 2), two offers were presented in sequence (Fig. 1*B*). A cue presented at the beginning of the trial indicated to the animal the choice modality for that trial. The two juices used in each session were labeled A and B, with A preferred, and we indicated the quantities offered in any given trial with *q*_{A} and *q*_{B}. For Task 2, trials in which juice A was offered first and trials in which juice B was offered first were referred to as AB trials and BA trials, respectively. The first and second offers were referred to as offer1 and offer2, respectively.

### Comparing choices across tasks

Our dataset included 306 sessions from 2 monkeys (115 from Monkey J, 191 from Monkey G). Sessions included 216-880 trials (mean ± SD = 590 ± 160). For each session, we analyzed trials with the two choice tasks separately using probit regressions (see Materials and Methods). For Task 1 (simultaneous offers), the probit fit provided measures for the relative value ρ_{Task 1} and the sigmoid steepness η_{Task 1}. For Task 2 (sequential offers), the probit fit provided measures for the relative value ρ_{Task 2}, the sigmoid steepness η_{Task 2}, and the order bias ε (Fig. 1*C–F*). Intuitively, the relative value was the quantity ratio *q*_{B}/*q*_{A} that made the animal indifferent between the two juices, the sigmoid steepness was inversely related to choice variability, and the order bias (measured in Task 2) was a bias favoring the first or the second offer. Specifically, ε < 0 indicated a bias favoring offer1 and ε > 0 indicated a bias favoring offer2.

The experimental design gave us the opportunity to compare choices across tasks. Our analyses revealed several phenomena. First, the relative values measured in the two tasks were very similar and highly correlated across sessions (*r* > 0.90; Fig. 1*D*). At the same time, ρ_{Task 1} and ρ_{Task 2} presented some differences. Specifically, relative values in Task 2 were generally higher than in Task 1 (*p* < 10^{−10}, *t* test), and this effect increased with the relative value. Second, sigmoids measured in Task 2 were significantly shallower compared with Task 1 (*p* < 10^{−25}, *t* test; Fig. 1*E*). In other words, presenting offers in sequence substantially increased choice variability. Third, in Task 2, animals showed an order bias favoring offer2 (Fig. 1*F*). This effect was highly significant (*p* < 10^{−25}, *t* test) but quantitatively modest (mean(ε) = 0.26 uB) compared with relative values, which typically ranged between 1 and 4 uB (mean(ρ) = 2.26 uB).

These three behavioral phenomena (larger choice variability, preference bias, and order bias) were likely due to the higher cognitive demands imposed by Task 2 (see Discussion). Importantly, these effects were relatively small and essentially orthogonal to the main question addressed in this study, concerning the relation between cell groups recorded in the two choice tasks. Thus, for the analyses of neuronal activity presented in the rest of this study, we examined responses of each neuron in each task in relation to variables defined based on the relative value measured in the same task, ignoring the order bias (Table 2).

### Neuronal classification in each choice task

Previous studies of choices under simultaneous offers identified in OFC different groups of cells encoding individual offer values, the binary choice outcome and the chosen value (Padoa-Schioppa and Assad, 2006). Similarly, recent work on choices under sequential offers identified different groups of cells encoding different decision variables (Ballesta and Padoa-Schioppa, 2019). Our goal was to ascertain whether the two sets of cell groups correspond to each other. To do so, we recorded and analyzed the activity of 1526 cells (672 cells from Monkey J and 854 cells from Monkey G). In the analysis, our general strategy was to classify cells separately in each task according to the same criteria used in previous work, and to then compare the results of the two classifications at the population level. Thus, we divided trials with Task 1 and Task 2 and proceeded in steps.

For Task 1 trials, we defined four 500 ms time windows aligned with the offer presentation (post-offer, late-delay) and the juice delivery (pre-juice and post-juice). A “trial type” was defined by two offers and a choice. For Task 2 trials, we defined three 500 ms time windows aligned with the two offers (post-offer1, post-offer2) and with the juice delivery (post-juice). A “trial type” was defined as two offers in a particular order and a choice. For each task, each trial type, and each time window, we averaged spike counts across trials. In each task, a neuronal response was defined as the firing rate of one cell in one time window as a function of the trial type. Neuronal responses were submitted to an ANOVA (factor: trial type). Neurons presenting a significant modulation (*p* < 0.01) in at least one task and at least one time window were identified as task-related and included in subsequent analyses. In total, 645 of 1526 (42%) cells met this criterion. Further analyses were restricted to this population.

While inspecting individual responses, we made three observations. First, replicating several previous studies, responses in Task 1 appeared to encode one of the variables offer value, chosen juice, or chosen value (Fig. 2). Second, confirming the results of our recent study on sequential offers, neurons in Task 2 appeared to encode different variables in different time windows. Across time windows, particular sequences of variables were most frequent. For example, in the three time windows under consideration, the neuron in Figure 2*C* encoded variables offer value B | BA, offer value B | AB, and chosen value B. These variables define sequence #3 in Table 2. In the same time windows, the cell in Figure 2*F* encoded variables –AB|BA, AB|BA, and chosen juice B. These variables define sequence #5. Similarly, the cell in Figure 2*I* encoded variables offer1 value, offer2 value and chosen value, which define sequence #7. Third and most important, there appeared to be a reliable correspondence between neuronal responses recorded in the two tasks. In principle, neurons tuned in one task could be untuned in the other task. That is, different cell assemblies in OFC could support choices in the two tasks. In contrast, neurons were typically tuned in both tasks or not at all. Furthermore, the variable encoded in Task 1 corresponded to specific sequences encoded in Task 2. For example, neurons encoding offer value A in Task 1 typically encoded sequence #1 in Task 2; neurons encoding offer value B in Task 1 typically encoded sequence #3 in Task 2; neurons encoding chosen juice A in Task 1 typically encoded sequence #5 in Task 2; etc. The three example cells in Figure 2 illustrate this point.

For a statistical analysis, we classified neurons in Task 1 and Task 2 following the same procedures of previous studies (Padoa-Schioppa, 2013; Ballesta and Padoa-Schioppa, 2019). For Task 1, we regressed each response against each variable. Each regression provided a slope and the *R*^{2}. If the slope differed significantly from zero (*p* < 0.05), the variable was said to explain the response. If the slope was statistically indistinguishable from zero, we set *R*^{2} = 0. We considered the signed *sR*^{2}, where the sign was obtained from the regression slope, summed it over time windows, took the absolute value, and assigned each neuron to the variable providing the maximum |*sum(sR ^{2})*| (see Materials and Methods). Task-related cells not explained by any variable in any time window were labeled “untuned.” For Task 2, we used a very similar procedure, except that, for any of the 8 sequences, different variables were examined in different time windows (Table 2). Again, each neuron was assigned to the sequence providing the maximum |*sum(sR ^{2})*|, where *sR*^{2} is the signed *R*^{2} and the sum is across time windows.
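The assignment rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the regression outputs (slope, *R*^{2}, *p* value) are taken as given, and the function name, the input layout, and the example numbers are hypothetical.

```python
def classify(fits, alpha=0.05):
    """Assign a cell to the class with the maximal |sum of signed R^2|.

    `fits[name]` holds one (slope, r2, p_value) tuple per time window, where
    `name` is a variable (Task 1) or a sequence of variables (Task 2).
    Returns (name, score), or ("untuned", 0.0) when no regression is
    significant in any time window.
    """
    best_name, best_score = "untuned", 0.0
    for name, windows in fits.items():
        total = 0.0
        for slope, r2, p_value in windows:
            if p_value < alpha:
                # Significant slope: R^2 takes the sign of the slope.
                total += r2 if slope > 0 else -r2
            # Otherwise R^2 is set to 0 and contributes nothing to the sum.
        if abs(total) > best_score:
            best_name, best_score = name, abs(total)
    return best_name, best_score


# Made-up regression results for one cell in two time windows.
fits = {
    "offer value A": [(1.2, 0.40, 0.001), (0.9, 0.30, 0.004)],
    "chosen value": [(2.0, 0.50, 0.001), (-1.0, 0.45, 0.002)],
    "chosen juice A": [(0.1, 0.05, 0.40), (0.2, 0.04, 0.30)],
}
label, score = classify(fits)
```

In this toy example the second candidate has the largest single *R*^{2}, but its slopes have opposite signs in the two windows, so the signed sum nearly cancels and the cell is assigned to the first candidate. In this sketch, ties are broken in favor of the class listed first.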

### Matching classifications across choice tasks

To compare the results across tasks, we constructed a contingency table where rows represented classes in Task 1, columns represented classes in Task 2, and each entry quantified the cell count. We envisioned three possible scenarios illustrated in Figure 3: (1) the table might be concentrated on the first row and first column (Fig. 3*A*), indicating that the two tasks engage different neuronal populations; (2) the table might present a uniform distribution (Fig. 3*B*), indicating that the two tasks engage the same neuronal population but the role of individual neurons differs across tasks; or (3) the contingency table might be concentrated on the diagonal (Fig. 3*C*), indicating that individual neurons have the same role in the two choice tasks.

Figure 4*A* illustrates the cell counts actually measured in the experiments. The vast majority of neurons were either non–task-related (881 of 1526 = 58%) or tuned in both tasks (490 of 1526 = 32%). Importantly, different groups of cells accounted for different numbers of neurons. Thus, to compare each cell count to chance level, we computed for each entry the *OR* (see Materials and Methods). We thus obtained a table of *OR*s (Fig. 4*B*). For each entry, *OR* = 1 was chance level; conversely, *OR* > 1 or *OR* < 1 indicated that the cell count was above or below that expected by chance, respectively. For each entry, a Fisher's exact test (*p* < 0.01) assessed whether departure from chance was statistically significant (Fig. 4*C*). Inspection of Figure 4*B* reveals that cell counts were significantly above chance for all entries on the diagonal. Conversely, the vast majority of off-diagonal entries (69 of 72) was at or below chance level. In conclusion, there was a strong correspondence between the class identified for any given cell in Task 1 and that identified for the same cell in Task 2.
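The per-entry statistics can be illustrated with the Python sketch below. The exact definition of the *OR* is given in Materials and Methods and is not restated here, so the sketch rests on an assumption: each entry (*i*, *j*) is evaluated by collapsing the contingency table into a 2 × 2 table (class *i* versus all other classes in Task 1, class *j* versus all other classes in Task 2). The function names and the toy counts are hypothetical.

```python
from math import comb


def collapse(table, i, j):
    """Collapse a contingency table around entry (i, j) into a 2x2 table."""
    total = sum(sum(row) for row in table)
    a = table[i][j]                        # class i (Task 1) and class j (Task 2)
    b = sum(table[i]) - a                  # class i, any other class in Task 2
    c = sum(row[j] for row in table) - a   # any other class in Task 1, class j
    d = total - a - b - c                  # neither class i nor class j
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Sample odds ratio; 1 corresponds to chance level."""
    if b * c == 0:
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on a 2x2 table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    r1, c1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return comb(c1, x) * comb(n - c1, r1 - x) / comb(n, r1)

    lo, hi = max(0, r1 + c1 - n), min(r1, c1)
    p_obs = prob(a)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)


# Made-up 3x3 cell counts (rows: Task 1 classes, columns: Task 2 classes).
toy = [[10, 2, 1],
       [3, 12, 2],
       [1, 2, 15]]
a, b, c, d = collapse(toy, 0, 0)
or_00 = odds_ratio(a, b, c, d)
p_00 = fisher_exact_two_sided(a, b, c, d)
```

With this construction, an entry concentrated on the diagonal yields an *OR* well above 1 and a small *p* value, whereas an entry with fewer cells than expected from the row and column totals yields an *OR* below 1.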

We noted that a few off-diagonal entries in Figure 4*B* were significantly above chance. We conducted two analyses to assess the significance of this observation. First, we examined whether this finding could be explained by the correlation between different variables defined in Table 1. This correlation and the intrinsic variability of neuronal firing rates likely induced some misclassification. We thus expected that instances for which *OR*_{i,j} was significantly >1 would occur only when the variables indexed by *i* and *j* were highly correlated with each other. To test our hypothesis, we generated the correlation matrix *C*, separately for each task. For Task 1, entry *C*_{m,n} in this matrix was simply the correlation between variables *m* and *n*, which did not depend on the time window. For Task 2, since each sequence included different variables in different time windows, we computed the correlation matrix separately in each time window using the relevant variables. We then calculated the mean correlation matrix across time windows. The correlation matrices obtained for the two tasks were similar, and we averaged them to obtain the average correlation matrix, referred to as table *Z*. Of note, table *Z* was symmetric by construction (Fig. 5*A*). Inspection of it reveals that correlations between specific pairs of variables were particularly high. For example, variables offer value A+ and chosen value A were highly correlated (*r* = 0.69). Similarly, variables offer value B+ and chosen value B were highly correlated (*r* = 0.63). To assess the relation between *OR* table and *Z* table, we plotted them against each other entry by entry (excluding the diagonals; Fig. 5*B*). The two tables were highly correlated (Pearson: *r* = 0.75, *p* = 4 × 10^{−11}). Furthermore, significant departure from chance level in the *OR* table (*OR*_{i,j} > 1) occurred only when the variable correlation was particularly high (*Z*_{i,j} > 0.5).

Second, to compare the results in Figure 4*B* to some benchmark, we generated equivalent *OR* tables separately for each choice task. For Task 1, we divided trials randomly in two sets (Set 1 and Set 2; see Materials and Methods). We analyzed the two sets of trials separately and thus obtained two independent classifications. We repeated this operation for each cell in the population, and generated a contingency table (not shown) and a table of *OR*s (Fig. 6*A*) where rows and columns corresponded to Set 1 trials and Set 2 trials, respectively. We repeated this analysis for data from Task 2 and obtained an equivalent *OR* table (Fig. 6*B*). Since the two sets of trials were interleaved and the criterion used to separate them was arbitrary, we expected the *OR* tables to concentrate on the diagonal. Conversely, non-zero off-diagonal entries should capture noise in the classification procedures due to the correlation between encoded variables (Fig. 5*A*) and to trial-to-trial variability in spike counts. To assess whether the table in Figure 4*B* (across tasks) differed significantly from the tables in Figure 6*A*, *B* (within task), we used a Breslow-Day test (see Materials and Methods). Figure 6*C* illustrates the results of this analysis. In essence, classifications across tasks were as consistent as classifications within tasks (all *p* ≥ 0.01).
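The within-task benchmark can be sketched as follows. This is a schematic illustration only: the details of the random split are described in Materials and Methods, so the equal-size split, the per-cell seeding, and the `classify` placeholder (standing in for whichever classification procedure is being benchmarked) are assumptions. The Breslow-Day comparison is not reproduced.

```python
import random
from collections import Counter


def split_trials(trial_ids, seed=0):
    """Randomly divide trials into two sets of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(trial_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


def split_half_table(cells, classify, seed=0):
    """Count cells by (class from Set 1 trials, class from Set 2 trials).

    `cells` maps a cell id to its list of trials. `classify` is a
    hypothetical callable that maps a list of trials to a class label.
    """
    table = Counter()
    for k, (cell_id, trials) in enumerate(sorted(cells.items())):
        set1, set2 = split_trials(range(len(trials)), seed + k)
        label1 = classify([trials[t] for t in set1])
        label2 = classify([trials[t] for t in set2])
        table[(label1, label2)] += 1
    return table


# Toy data: each "trial" is a number and the toy classifier thresholds the sum.
cells = {"c1": [5] * 10, "c2": [0] * 10, "c3": [5] * 8}
toy_classify = lambda trials: "tuned" if sum(trials) > 0 else "untuned"
table = split_half_table(cells, toy_classify)
set1, set2 = split_trials(range(11), seed=1)
```

Because the two sets come from the same interleaved trials, a reliable classifier should place most cells on the diagonal of the resulting table, and off-diagonal counts reflect classification noise.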

In all previous analyses, neurons were classified based on the activity recorded in multiple time windows. As a control, we repeated the analysis of Figure 4 using only one time window for each task. We then matched time windows across tasks. Indicating with [*x*, *y*] the pair formed by time windows *x* (Task 1) and *y* (Task 2), we examined the three pairs [post-offer, post-offer1], [post-offer, post-offer2], and [post-juice, post-juice]. In general, the results based on a single time window were similar to those based on multiple windows, albeit noisier. For example, considering the time window pair [post-offer, post-offer1], all the diagonal entries in the *OR* table were significantly above chance (along with a few off-diagonal entries). The two other pairs of time windows provided similar pictures. These findings confirmed that there was a strong correspondence between the cell classification in the two choice tasks.

Together, our results indicated that the cell groups identified under sequential offers are equivalent to those identified under simultaneous offers. Building on this finding, we proceeded with a comprehensive classification based on both choice tasks, by summing the *R*^{2} across all seven time windows (see Materials and Methods). Henceforth, we may refer to the different groups of cells using the standard nomenclature (offer value, chosen juice, and chosen value) independently of the choice task. In total, the final classification resulted in 235 offer value cells, 168 chosen juice cells, and 233 chosen value cells.

### Matching *MSW*s across choice tasks

To complement the results described above, we examined whether there was some correspondence between the time windows in which any given neuron was most selective in Task 1 and in Task 2. To address this issue, we defined for each cell, each task and each time window the *SAR*, which captured the strength of the encoding (see Materials and Methods). For each cell and each task, the *MSW* may be defined as the time window for which *SAR* was maximal. Because neurons often responded similarly in different time windows (e.g., post-offer1 and post-offer2 in Task 2), we used a soft definition and identified as *MSW* any time window such that *SAR/max*(*SAR*) > 0.6.
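The soft definition of the *MSW* amounts to a simple threshold relative to the peak. The sketch below is a minimal illustration, assuming that the *SAR* values for one cell and one task are already available as non-negative numbers (the *SAR* itself is defined in Materials and Methods). The function name and the example values are hypothetical.

```python
def most_selective_windows(sar, threshold=0.6):
    """Return every time window with SAR / max(SAR) above `threshold`.

    `sar` maps a time-window name to the SAR of one cell in one task.
    """
    peak = max(sar.values())
    if peak <= 0:
        return []  # no selectivity in any window
    return [window for window, value in sar.items() if value / peak > threshold]


# Made-up SAR values for one cell in Task 2.
sar_task2 = {"post-offer1": 0.30, "post-offer2": 0.25, "post-juice": 0.10}
msw = most_selective_windows(sar_task2)
```

Here both post-offer windows qualify as *MSW*s, which is the case the soft definition is meant to handle. A strict maximum would have kept only post-offer1.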

To compare the *MSWs* identified for each cell in the two tasks across the population, we generated a contingency table where rows and columns represented time windows in the two tasks and each entry indicated the number of cells with corresponding *MSW*s (Fig. 7*A*). We also computed the corresponding table of *OR*s (Fig. 7*B*), and the *p* values obtained from Fisher's exact tests (Fig. 7*C*). The results indicated that time windows for which neurons were maximally selective in the two tasks were systematically related across the population. Specifically, cells for which *MSW* = post-offer in Task 1 typically had *MSW* = post-offer1 and/or *MSW* = post-offer2 in Task 2. Similarly, cells for which *MSW* = post-juice in Task 1 typically had *MSW* = post-juice in Task 2. This finding supports the understanding that individual neurons have similar functions in the two choice tasks.

## Discussion

The past 20 years have witnessed enormous progress in the understanding of the cognitive and neural underpinnings of economic choices. An extensive body of work demonstrates beyond reasonable doubt that subjective values are explicitly represented at the neuronal level (Padoa-Schioppa, 2007; Kable and Glimcher, 2009; O'Doherty, 2014; Perkins and Rich, 2021). Furthermore, substantial evidence links economic decisions to neuronal activity in the OFC. Neurons in this area represent different decision variables in a categorical way (Hirokawa et al., 2019; Onken et al., 2019). In particular, when monkeys (or mice) choose between juices, different groups of cells encode individual offer values, the binary choice outcome and the chosen value (Padoa-Schioppa and Assad, 2006; Kuwabara et al., 2020). These variables capture both the input and the output of the choice process, suggesting that the cell groups identified in OFC might constitute the building blocks of a decision circuit. The population dynamics (Rich and Wallis, 2016), correlations between neuronal and behavioral variability (Padoa-Schioppa, 2013), the effects of lesions (Camille et al., 2011; Yu et al., 2018) or inactivation (Gore et al., 2019; Kuwabara et al., 2020), and computational modeling (Rustichini and Padoa-Schioppa, 2015; Song et al., 2017; Zhang et al., 2018) support this proposal. These and corroborating results set the stage for a detailed understanding of the decision mechanisms. One important caveat is that current notions came primarily from studies in which two offers were presented simultaneously. Yet, in many daily choices, offers appear or are examined sequentially, and some authors suggested that choices under sequential offers may rely on fundamentally different mechanisms (Kacelnik et al., 2011; Hunt et al., 2013; Hayden and Moreno-Bote, 2018). Thus, the purpose of this study was to assess whether choices under sequential and simultaneous offers engage the same neural circuit.

In a previous study, we recorded from the OFC under sequential offers. Through an analysis of neuronal responses across time windows, we identified different groups of neurons encoding different sequences of decision variables (Ballesta and Padoa-Schioppa, 2019). Importantly, since any choice task engages only a subset of neurons, it remained unclear whether choices under sequential or simultaneous offers rely on the same neuronal population, or whether the functional role of any given cell would be preserved across choice modalities. In the present study, we alternated two choice tasks on a trial-by-trial basis. In a nutshell, we found a strong correspondence between the cell groups previously identified in the two conditions. In other words, choices under sequential or simultaneous offers appear to rely on the same neural circuit. This result indicates that notions emerging from studies of choices under simultaneous offers generalize to a much broader domain of choices than previously recognized.

An interesting question is whether the neuronal populations described here might also subserve foraging choices, such as those made by an animal that could continue to exploit the current food patch or leave it in search of better but riskier opportunities. Such choices are sometimes construed as “yes-or-no” and distinguished from binary choices of the sort examined in our experiments (Kolling et al., 2012; Hayden and Moreno-Bote, 2018). However, what appears as a yes-or-no choice could also be construed as a binary choice between two offers: one unambiguous (the current patch) and one more ambiguous (a probability distribution over other possible patches and times necessary to reach them). In economics, the value of the latter offer is often referred to as an opportunity cost. The question of how the brain treats patch-leaving choices (as yes-or-no or as binary) remains open. If patch-leaving choices are treated as binary, it seems reasonable to assume that the neuronal populations identified in our studies play the same role when choices are about leaving a food patch. Conversely, if the brain treats patch-leaving choices as qualitatively different, such choices might rely on different neuronal mechanisms. These intriguing questions remain open for future research.

Alternating the two tasks within each session gave us the opportunity to compare choices in a controlled way. We thus discovered three interesting phenomena. Under sequential offers, (a) choices were more variable, (b) relative values were higher (preference bias), and (c) choices were biased in favor of the second offer (order bias). The last observation confirms previous reports (Krajbich et al., 2010; Ballesta and Padoa-Schioppa, 2019; A. Rustichini, personal communication). At the cognitive level, these phenomena may be understood as follows. The difference in choice variability (a) may be interpreted noting that choices under sequential offers were cognitively more demanding because they required holding in working memory the value of offer1, comparing the values of two goods when only offer2 was visible, remembering the chosen juice for an additional delay, and mapping that choice onto the appropriate saccade target. Each of these mental operations could contribute to choice variability. Along similar lines, the preference bias (b) may reflect the higher cognitive demands of Task 2. In particular, we note that, when the two offer targets appear on the monitor, information about the two values is no longer on display on the monitor. If at that point the animal has not finalized its decision, or if it has failed to retain in working memory the decision outcome, it makes sense to choose the target associated with the more valuable juice (juice A). Finally, the order bias (c) may be interpreted noting that decisions in Task 2 were made shortly after offer2 appeared on the monitor, when that offer was perceptually most salient. Thus, a choice bias favoring offer2 is not surprising. The neuronal origins of choice biases, including the phenomena documented here, remain an important and open question for future work.

## Footnotes

This work was supported by National Institutes of Health Grants R01-DA032758 and R01-MH104494 to C.P.-S.; and a McDonnell Center for Systems Neuroscience Predoctoral Fellowship to W.S. We thank H. Schoknecht for help with animal training; and A. Livi, M. Zhang, and J. Tu for comments on the manuscript.

The authors declare no competing financial interests.

- Correspondence should be addressed to Camillo Padoa-Schioppa at camillo{at}wustl.edu