During behavior, the oculomotor system is tasked with selecting objects from an ever-changing visual field and guiding eye movements to these locations. The attentional priority given to visual targets during selection can be strongly influenced by external stimulus properties or internal goals based on previous experience. Although these exogenous and endogenous drivers of selection are known to operate across partially overlapping timescales, the form of their interaction over time remains poorly understood. Using a novel choice task that simultaneously manipulates stimulus- and goal-driven attention, we demonstrate that exogenous and endogenous attentional biases change linearly as a function of time after stimulus onset and have an additive influence on the visual selection process in rhesus macaques (Macaca mulatta). We present a family of computational models that quantify this interaction over time and detail the history dependence of both processes. The computational models reveal the existence of a critical 140–180 ms attentional “switching” time, when stimulus- and goal-driven processes simultaneously favor competing visual targets. These results suggest that the brain uses a linear sum of attentional biases to guide visual selection.
Visual selection is the process by which the brain's attentional mechanisms target one location in the visual field for the purpose of perceptual enhancement or saccade planning (Awh et al., 2006). In the primate oculomotor system, two distinct attentional processes are known to drive visual selection: sensory-driven exogenous (“bottom-up”) attention and goal-driven endogenous (“top-down”) attention (Desimone and Duncan, 1995; Egeth and Yantis, 1997; Kastner and Ungerleider, 2000; Awh et al., 2006; Knudsen, 2007). Exogenous attention is automatic and can reliably be “captured” by a flashed object (Yantis and Jonides, 1984, 1996; Nakayama and Mackeben, 1989; Egeth and Yantis, 1997) or pop-out stimulus (Joseph and Optican, 1996), even if these cues are uninformative or task-irrelevant (Liu et al., 2005; Giordano et al., 2009). Endogenous attention is a voluntary process that supports the monitoring of peripheral targets or locations, has been shown to improve discriminability and speed of information accrual at monitored locations, and varies flexibly with task demands such as cue validity (Giordano et al., 2009).
Recent work has shown that sensory- and goal-driven attention are subserved by distinct brain mechanisms (Kastner and Ungerleider, 2000; Corbetta and Shulman, 2002; Giordano et al., 2009; Ross et al., 2010). These mechanisms are likely to interact, based on the observation that exogenous attention to a distractor location interferes with endogenous attention to a target located elsewhere in the visual field (Theeuwes and Burger, 1998). However, the nature of this interaction remains unknown, because endogenous and exogenous manipulations of competition have been studied in isolation only (Beck and Kastner, 2009). How is competition resolved when stimulus- and goal-driven factors simultaneously drive the selection of different targets in the visual field?
To address this problem, we developed a simple two-target, free reaction time decision task, in which we simultaneously manipulate sensory- and goal-driven attentional processes by varying the relative luminance and relative reward values of the targets. Through parametric variation of luminance contrast and expected reward, and by using reaction time as a proxy for internal selection dynamics, we investigate how attentional biases derived from these stimulus properties evolve in time after target onset. In particular, when luminance and reward favor different targets, we find that the selected target location is strongly influenced by reaction time: fast reaction times lead to a stronger sensory-driven attentional bias, while slow reaction times lead to a stronger goal-driven attentional bias. We present a family of computational models to quantify the interaction between luminance and reward biases in time, as well as their dependence on prior experience. Our best-fitting model demonstrates that bottom-up and top-down biases combine linearly at all times to drive visual selection, and although reward bias is shaped by previous experience, luminance bias is not.
Materials and Methods
Two adult male rhesus macaques (Macaca mulatta) participated in the study (monkey A and monkey S, 9.5 kg and 8.4 kg, respectively, at the start of the experiments). Both animals had been used previously in other experiments studying eye movements but were naive to the choice task used in this study. Identical training protocols were used for both animals (see below). Before behavioral training, each animal was instrumented with a head restraint prosthesis to allow fixation of head position and tracking of eye position. All surgical and animal care procedures were approved by the New York University Animal Care and Use Committee and were performed in accordance with the National Institutes of Health guidelines for care and use of laboratory animals.
Each monkey was behaviorally trained for several weeks in an unlit sound-attenuated room (ETS Lindgren). Eye position was constantly monitored with an infrared optical eye tracking system sampling at 120 Hz (ISCAN). Eye positions were digitized at 1 kHz. Visual stimuli were presented on an LCD screen (Dell Inc) placed 34 cm from the subjects' eyes. The visual stimuli were controlled via custom LabVIEW (National Instruments) software executed on a real-time embedded system (NI PXI-8184, National Instruments).
Luminance-reward selection task
To test a simple conceptual model of selection (Fig. 1a), each monkey performed the two-alternative choice task shown in Figure 1b. Two identically sized rectangular stimuli with a 3-to-1 aspect ratio served as the targets in this task, with each target associated with a different value of liquid reward. The long axis of each target subtended 2° of visual arc. Target 1 (T1) was oriented so that the long axis was vertical, and target 2 (T2) was oriented so that the long axis was horizontal (Fig. 1b). The monkeys were motivated to find the target associated with the highest value of liquid reward. The mean value of the liquid reward associated with each target was kept constant for blocks of 40–70 trials, after which a new mean for each target was assigned. We randomized the number of trials in each block to discourage an influence by the number of trials completed in a block. The mean of the reward values tested varied between 0.04 ml/trial and 0.21 ml/trial across blocks. Changes in reward magnitude between block transitions were unsignaled, and a Gaussian-distributed variability (SD = 0.015 ml) was added to the value associated with both targets on every trial. Adding variability to reward magnitude across trials ensured that we could perform regression analyses and increased the subjects' uncertainty about the times of reward block transitions.
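The block structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the LabVIEW code used in the experiments: the function name `make_reward_schedule` is hypothetical, and the uniform draws for block length and block means are assumptions, since the text gives only the ranges (40–70 trials, 0.04–0.21 ml/trial) and the trial-by-trial Gaussian variability (SD = 0.015 ml).

```python
import random

def make_reward_schedule(n_blocks, seed=0):
    """Return one (mean_t1, mean_t2, reward_t1, reward_t2) tuple per trial.

    Assumptions (not stated in the text): block lengths and block means
    are drawn uniformly, and the noise on the two targets is independent.
    No clipping is applied to the noisy reward values.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_blocks):
        block_len = rng.randint(40, 70)       # unsignaled block of 40-70 trials
        mean_t1 = rng.uniform(0.04, 0.21)     # ml/trial, held fixed within the block
        mean_t2 = rng.uniform(0.04, 0.21)
        for _ in range(block_len):
            # Gaussian variability (SD = 0.015 ml) added on every trial
            r1 = mean_t1 + rng.gauss(0.0, 0.015)
            r2 = mean_t2 + rng.gauss(0.0, 0.015)
            trials.append((mean_t1, mean_t2, r1, r2))
    return trials
```

Calling `make_reward_schedule(5)` yields between 200 and 350 trials, with the block means changing without any signal at each block boundary.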
The luminance of T1 was randomly chosen on each trial from a log-uniform distribution of values ranging from 0.01 to 12.15 cd/m2. The minimum of this distribution was set above the psychophysical threshold for stimulus detection during a single-target delayed saccade task for both monkeys. After the luminance parameter for T1 was chosen, the luminance of T2 was assigned such that the mean luminance across both targets was 6 cd/m2. Although the luminance of one target was informative about the luminance of the other target, the randomized target locations guaranteed that subjects could not determine the location of a dim target from the location of a bright target. On each trial, target luminance values were chosen independently from the rewards associated with T1 and T2.
Each monkey performed saccadic eye movements for liquid rewards. The monkeys started each trial by placing both hands on proximity sensors, after which a red square was centrally presented. The monkeys were required to fixate within 2° of the center of the red square for a 500–800 ms baseline period. After the baseline, the central red square was extinguished and two red targets (T1 and T2) were presented at random locations in the visual periphery at a 10° eccentricity from the central fixation. We randomized the spatial locations of each target on each trial to reduce the influence of previous experience on the allocation of spatial attention at the start of each trial. The separation between target pairs was constrained to be at least 90° on each trial. Target onset cued the subjects to perform a free-choice saccade to one of the two targets. After the saccade was completed, fixation was maintained for 300 ms at the chosen target, following which the appropriate reward was delivered <500 ms after the eye movement was completed. Each trial lasted 890–1400 ms, and only one choice could be made per trial. Trials were separated by a 1000–1500 ms intertrial interval (ITI) beginning at the end of the time of reinforcer delivery. No visual stimuli were presented during the ITI. The range of trial durations derives from the variability in the amount of time taken by the monkeys to select and execute their eye movements. Relative to the duration of the trial, this time was short (mean ± SD reaction time 168 ± 31 ms for monkey A, 192 ± 30 ms for monkey S). Reaction times shorter than 100 ms, mediated by express saccades (Sommer, 1997), were rare in data from each animal: 1060 trials in monkey A and 480 trials in monkey S, or 3% and 1% of all trials, respectively. These data were included in all model fits shown here, and our results were unaffected by their exclusion.
A trial was aborted if the monkey failed to align its gaze within 2° of the center of the fixation or choice targets. When an abort was detected, all visual stimuli were extinguished immediately, no reinforcers were delivered, and the trial was restarted after a 1200–1800 ms intertrial interval. Both monkeys rarely aborted trials (4% for monkey A, 5% for monkey S). Aborted trials were excluded from further analyses. The data analyzed were 37,816 completed trials for monkey A (30,938 after excluding the first 10 trials from each block) and 54,026 completed trials for monkey S (43,774 after excluding the first 10 trials from each block). Data reported here were collected after at least 3 weeks of training on the choice task.
Computational models of choice behavior
We developed computational models of choice behavior to describe steady-state choice behavior and dynamic choice behavior.
Steady-state choice behavior.
A generalized linear model (GLM) was fit to steady-state choice behavior to explain choices in terms of choice biases derived from luminance contrast and reward difference after the current reward distribution was learned:

$$\log\left(\frac{p_{T1}}{p_{T2}}\right) = B_L - B_R \qquad (1)$$

where pT1 and pT2 are the probability of choosing T1 or T2, respectively, BL is the luminance-driven choice bias, and BR is the reward-driven choice bias. In the exact form of the model (Equation 2), L encodes the luminance contrast [log10(LT1/LT2)] on the current trial, R encodes the mean reward difference for the current block, and T encodes reaction time on the current trial. In all fits of this model, L and R are in the range [−1, 1], and T is in the range [0, 1]. This required us to map luminance contrast (L) from the domain (−2, 2) log10 units onto (−1, 1) for both monkeys. Reward difference (R) was mapped from (−0.2, 0.2) ml onto (−1, 1) for monkey A, and from (−0.07, 0.07) ml onto (−1, 1) for monkey S. This threefold decrease in range was used to correct for monkey S's comparatively higher sensitivity to reward differences.
To allow for the possibility that reward information does not become available until a fixed delay after target onset, the reward bias, BR, was treated as a piecewise linear function that has value β × R at reaction times between 0 and Tmin ms, and follows the time-dependent form R(β − γRT) when T > Tmin. To enforce this piecewise linearity, reaction time (T) was mapped from (128, 300) ms onto (0, 1) for monkey A, and from (81, 300) ms onto (0, 1) for monkey S. Reaction times less than Tmin = 128 or 81 ms, respectively, were linearly mapped onto negative values of T in the BL expression, but clamped to 0 in the BR expression. These Tmin values were chosen from the range 0–300 ms to minimize the deviance of model fits based on Equation 2 (see below). In practice, these optimal fits resulted in β ≈ 0 for both monkeys. Therefore, we omitted β from both steady-state fits with only minimal increase of deviance (Table 1). In all models, the fit quality was similar whether BR was piecewise linear or not, because most of the reaction times in our data are >128 ms.
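The regressor scaling described above amounts to two linear maps plus a clamp. The following Python sketch illustrates it; the function names are hypothetical, and the 300 ms upper bound and the Tmin values are taken directly from the text.

```python
def scale_symmetric(value, half_range):
    """Map a value from (-half_range, half_range) onto (-1, 1).

    Per the text: half_range = 2 for luminance contrast (log10 units),
    0.2 ml for monkey A's reward difference, 0.07 ml for monkey S's.
    """
    return value / half_range

def scale_rt(rt_ms, t_min_ms, t_max_ms=300.0):
    """Map reaction time from (t_min_ms, t_max_ms) onto (0, 1).

    Reaction times below t_min_ms map onto negative values.
    """
    return (rt_ms - t_min_ms) / (t_max_ms - t_min_ms)

def rt_regressors(rt_ms, t_min_ms):
    """Return (T for the B_L expression, T for the B_R expression).

    The B_R copy is clamped to 0 below t_min_ms, which is what makes
    the reward bias piecewise linear.
    """
    t = scale_rt(rt_ms, t_min_ms)
    return t, max(t, 0.0)
```

For monkey A (Tmin = 128 ms), `rt_regressors(214, 128)` returns `(0.5, 0.5)`, while an 85 ms reaction time gives `(-0.25, 0.0)`.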
The α and β coefficients in Equation 2 measure the initial choice bias derived from luminance contrast and reward difference, respectively, when T = 0. The γL and γR coefficients specify the rate of change of these choice biases over time. Model parameters were fit through a logistic regression of luminance contrast and reward difference on individual trials, excluding the first 10 trials after each block transition. Therefore, this model studies competitive interactions between contrast and reward at steady state, i.e., after the animal has learned the current reward distribution. The dependent variable specified a binary encoding of choice behavior on individual trials, with 1 indicating choice of T1 and 0 indicating choice of T2. All parameters of Equation 2 were fit using the glmfit command in Matlab (MathWorks) using a logit link function. Although the constant term was unconstrained in the GLM, all constants were zero in the fits and therefore are not reported here. Model predictions for a given set of regressor values were obtained using the glmval command in Matlab. This function returns an estimate of pT1, the probability of choosing target 1.
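The prediction step can be illustrated with a short Python sketch that applies the inverse logit to a net bias. It is not the Matlab code used for the fits. It assumes that each bias is its regressor multiplied by a weight that is linear in T, and that the net bias is BL − BR. Each bias is parameterized by a generic (intercept, slope) pair: the intercept plays the role of α or β, and the slope is the signed rate of change, whose mapping onto γL and γR depends on the sign convention of Equation 2. The function names are hypothetical.

```python
import math

def linear_bias(x, t, intercept, slope):
    """Bias that scales with regressor x and changes linearly with time t."""
    return (intercept + slope * t) * x

def predict_p_t1(L, R, T, lum, rew):
    """Predicted probability of choosing T1.

    lum and rew are (intercept, slope) pairs for B_L and B_R.
    Assumed form: log(pT1 / pT2) = B_L - B_R, with T clamped at 0
    inside B_R (the piecewise-linear reward bias).
    """
    b_l = linear_bias(L, T, *lum)
    b_r = linear_bias(R, max(T, 0.0), *rew)
    log_odds = b_l - b_r
    return 1.0 / (1.0 + math.exp(-log_odds))  # inverse of the logit link
```

With no luminance contrast and no reward difference the function returns 0.5 regardless of the coefficients, which is the expected behavior when the constant term is zero.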
We performed model selection to determine the relationship between all regressors and choice behavior. We fit multiple models using subsets of the coefficients specified by Equation 2 and tested the reduction in deviance between each pair of models using Akaike's information criterion (AIC) (Akaike, 1974), a penalized-likelihood statistic. The AIC estimates the information lost by approximating the true process underlying the data by a particular model (Burnham and Anderson, 1998). For each candidate model, the AIC is computed as

$$\mathrm{AIC} = \mathrm{deviance} + 2k \qquad (3)$$

where the deviance is −2 times the maximized log-likelihood of the model fit and k is the number of parameters. This measure balances the quality of each fit against the increase in model complexity due to the addition of more model parameters (Lau and Glimcher, 2005). The differences in AIC values across models represent the degree of evidence in favor of the best-fitting model, and give a sense of the contribution of each model component when two models differ by inclusion of one parameter. The larger the difference in AIC, the less plausible a model is compared to the best model; values >10 on this scale provide strong support for the model with the smallest AIC value (Burnham and Anderson, 1998). We checked goodness of fit by using the best model to predict mean choice behavior (excluding 10 trials following transitions between reward blocks) as a function of reaction time (see Fig. 6). We chose not to use cross-validation or bootstrapping methods to further test goodness of fit because the AIC already provides a conservative estimate of fit quality and the mean predicted choice behavior was consistent with experimental data using a model with only four parameters.
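As a worked illustration of this comparison, the Python sketch below computes the deviance of a binary-choice model, its AIC, and the AIC differences relative to the best model. It takes the deviance to be −2 times the log-likelihood, which holds for single-trial binary data because the saturated model has zero log-likelihood. The function names and the model labels in the usage note are hypothetical.

```python
import math

def binomial_deviance(choices, probs):
    """Deviance for binary choices (1 = T1, 0 = T2) given predicted pT1."""
    log_lik = sum(
        math.log(p) if c == 1 else math.log(1.0 - p)
        for c, p in zip(choices, probs)
    )
    return -2.0 * log_lik

def aic(deviance, k):
    """AIC = deviance + 2k, where k is the number of fitted parameters."""
    return deviance + 2 * k

def delta_aic(aic_values):
    """AIC difference of each model relative to the lowest-AIC model."""
    best = min(aic_values.values())
    return {name: value - best for name, value in aic_values.items()}
```

For example, `delta_aic({"full": 108.0, "reduced": 120.0})` returns `{"full": 0.0, "reduced": 12.0}`; a difference above 10 would count as strong support for the model labeled "full".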
We also extended the steady-state computational model to test other forms of the dependence between choice behavior, reaction time, luminance, and reward. Specifically, we tested an extension of the steady-state model that adds a (δT × T + δLR × L × R) term to Equation 2. The δT and δLR coefficients describe the influence of reaction time alone and the multiplicative interaction between luminance contrast and expected reward on choice behavior, respectively. AIC values for these model extensions are presented to test whether these terms significantly improve model performance (Tables 1, 2).
Dynamic choice behavior model.
To investigate the dependence of choice behavior on previous experience, we extended the steady-state choice behavior model in Equation 2 to incorporate the influence of luminance contrast and experienced reward values during previous trials (Equation 4). In this model, the subscript i indexes the value of a coefficient or regressor on trial i with respect to the current trial. RiT1 and RiT2 represent experienced reward on trial i if the subject chose T1 or T2, respectively. The reward regressor associated with the unchosen target was set to zero on each trial. Therefore, unlike the steady-state choice behavior model, the dynamic choice behavior model makes no assumptions about steady-state behavior and can be used to model changes in choice biases both within and across trials.
The reaction time regressor, T0, represents reaction time on the current trial (i = 0) only, and uses the same piecewise-linear encoding for each animal described previously. This resulted in βi ≈ 0 for both animals. We generated variants of Equation 4 using subsets of parameters from the full model, and tested the reduction in deviance between each pair of models using the AIC statistic (Table 2). All model fits presented here reflect p < 0.05 confidence (Student's t test) for all parameter values. Here, we present a 10-trial lag in the dynamic choice model to discover the influence of nonzero regressor coefficients on previous trials while still achieving the desired level of confidence for all fits. We tested models with a lag parameter >10 trials and found that higher-order lag coefficients either were not statistically significant (Student's t test) or, when significant, the coefficients at lags >10 trials were all approximately equal to zero. These results indicate that including >10 trials in the past is not relevant for understanding choice behavior under the circumstances studied here.
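The history regressors for the reward terms can be assembled as in the following Python sketch. It is illustrative only: the function names are hypothetical, choices are coded 1 for T1 and 0 for T2 as in the steady-state fits, and the sketch assumes that reward lags run from 1 to 10 trials back.

```python
def experienced_reward_diff(choice, reward):
    """R^T1 - R^T2 for one trial.

    The regressor for the unchosen target is zero, so the difference is
    +reward after a T1 choice and -reward after a T2 choice.
    """
    return reward if choice == 1 else -reward

def lagged_reward_regressors(choices, rewards, n_lags=10):
    """One row per trial; column j holds the reward difference j+1 trials back.

    The first n_lags trials are skipped because they lack a full history.
    """
    diffs = [experienced_reward_diff(c, r) for c, r in zip(choices, rewards)]
    rows = []
    for t in range(n_lags, len(diffs)):
        rows.append([diffs[t - lag] for lag in range(1, n_lags + 1)])
    return rows
```

With `choices = [1, 0, 1, 1]`, `rewards = [0.1, 0.2, 0.15, 0.05]`, and `n_lags=2`, the rows are `[-0.2, 0.1]` for the third trial and `[0.15, -0.2]` for the fourth.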
The dynamic choice behavior model is more informative than the steady-state choice behavior model about the drivers of choice behavior on each trial as a function of trial lag. However, this advantage occurs at the expense of significantly increased model complexity. The large number of coefficients necessary to model changing choice biases means we cannot test nonlinear interactions between luminance contrast and reward with the limited numbers of trials available. By contrast, we can use the steady-state choice behavior model to study both linear and nonlinear models of selection. For this reason, both steady-state and dynamic choice behavior models are useful for understanding the processes that drive selection.
Results
A simple conceptual model of selection is that “top-down” expected reward and “bottom-up” sensory input channels drive visual selection, which then drives choice behavior (Fig. 1a) (Awh et al., 2006; Knudsen, 2007; Theeuwes, 2010). To understand how these drivers interact, we parametrically varied the strength of each driver by training two monkeys to perform a luminance-reward selection (LRS) task (Fig. 1b, see Materials and Methods) and treating choice behavior as a proxy for visual selection in our analysis. We reasoned that each monkey would show a strong choice bias for one target when reward and luminance differences both favored the selection of the same target (congruent scenarios) (Fig. 1c). However, it was unclear which target would be selected when reward and luminance drove selection of different targets (conflict scenarios).
Figure 2 summarizes behavioral data from two monkeys that performed the LRS task (37,816 trials monkey A; 54,026 trials monkey S). Saccade reaction times for both animals were typically within 100–300 ms (Fig. 2ai,bi; express saccades not shown). Since perceptual deficits caused by exogenously captured attention become weaker over time (Bisley and Goldberg, 2003; Giordano et al., 2009), we analyzed the relationship between luminance contrast [calculated as log10(LT1/LT2), where Lk represents the luminance of target k] and target choice probability as a function of reaction time.
We partitioned each monkey's reaction time histogram into three categories (fast, medium, and slow) (Fig. 2ai) using boundaries chosen to illustrate the variability of choice behavior over time. These boundaries were different across animals, due to idiosyncratic differences in reaction time distributions and time-dependent selection biases. Then we calculated the marginal probability of choosing a target as a function of luminance contrast for all trials with zero reward difference, and grouped these data according to the three reaction time categories. Figure 2aii shows that luminance contrast strongly influences choice probability when reaction time is fast (<140 ms), but these variables become weakly related when reaction time is slow (>200 ms). We observed the opposite relationship when the same analysis was applied to trials with variable reward and zero luminance contrast (Fig. 2aiii). In these data, reward is weakly related to choice probability when reaction time is fast, but these variables become strongly related when reaction time is slow. These results are qualitatively consistent across reaction time categories for both monkeys (Fig. 2bii–biii). However, when compared over absolute time, these data suggest that the reward sensitivity of monkey S is larger and increases more rapidly after target onset than that of monkey A.
The marginal choice probability profiles shown in Figure 2 suggest that exogenous and endogenous attention evolve over different timescales, with sensory-driven attention dominating early after target onset, and reward-driven attention dominating later in the trial. Given that these processes are supported by distinct mechanisms in the brain (Kastner and Ungerleider, 2000; Corbetta and Shulman, 2002; Giordano et al., 2009), their influence over choice behavior could be additive when reward difference and luminance contrast are varied simultaneously. To test this hypothesis, we developed the steady-state choice behavior model (Eqs. 1, 2, and see Materials and Methods).
Figure 3 presents the steady-state model for four special cases. When α = 1 and all other parameters are set to zero, the model specifies that target choice probability is a function of luminance contrast alone (Fig. 3a). Similarly, when β = −1 and all other parameters are zero, target choice probability is a function of reward difference alone (Fig. 3b). When α = 1 and β = −1, target choice probability depends jointly on luminance contrast and reward difference (Fig. 3c). Finally, Figure 3d shows a model parameterization that qualitatively reproduces the reaction time dependence of L and R implied by the marginal choice curves in Figure 2. This parameterization is described in Equation 5. For fast reaction times (T = 0), Equation 5 specifies that target choice probability is a function of luminance contrast only. For intermediate reaction times (T = 0.5), target choice probability depends jointly on luminance contrast and reward difference. For slow reaction times (T = 1), target choice probability is a function of reward difference only.
The model parameterization shown in Equation 5 also makes specific predictions concerning the time evolution of the bias terms, BL and BR, during congruent and conflict scenarios (Fig. 4). During congruent scenarios, when both L and R favor the same target, the net choice bias (BL − BR) favors a single target for all reaction times. During conflict scenarios, when L and R favor different targets, the net choice bias transitions from favoring T1 to T2, or vice versa, at a critical “switching” point, Tswitch.
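The switching point can be made concrete with a small Python sketch that solves BL − BR = 0 for T. It assumes each bias has the form (intercept + slope × T) × regressor, with the slope carrying its own sign. The function names are hypothetical, and the coefficients in the usage note are illustrative values chosen to match the qualitative description of Figure 3d; they are not the fitted parameters.

```python
def switch_time(l_intercept, l_slope, r_intercept, r_slope, L, R):
    """Normalized T at which the net bias B_L - B_R crosses zero.

    Assumed forms: B_L = (l_intercept + l_slope * T) * L and
    B_R = (r_intercept + r_slope * T) * R.
    Returns None when the net bias does not change with T (no crossing).
    """
    denom = l_slope * L - r_slope * R
    if denom == 0:
        return None
    return (r_intercept * R - l_intercept * L) / denom

def to_ms(T, t_min_ms, t_max_ms=300.0):
    """Invert the reaction time scaling: normalized T back to milliseconds."""
    return t_min_ms + T * (t_max_ms - t_min_ms)
```

Taking a luminance bias that decays as (1 − T) × L and a reward contribution that grows as T × R, a conflict trial with L = 1 and R = −1 gives `switch_time(1, -1, 0, -1, 1, -1) == 0.5`, which corresponds to 214 ms under monkey A's reaction time mapping (`to_ms(0.5, 128)`). A congruent trial with L = R = 1 returns `None`, reflecting a net bias that favors the same target at all reaction times.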
Figure 5a summarizes T1 choice probability, pT1, for monkey A's behavior during trials when luminance contrast and reward difference were varied simultaneously. Behavioral data are grouped according to the reaction time categories shown in Figure 2ai. In these data, luminance contrast is the primary driver of choice probability when reaction time is fast, luminance contrast and reward difference jointly drive choice probability when reaction time is intermediate, and reward difference is the primary driver of choice probability when reaction time is slow. These behavioral results are consistent across animals (Fig. 5c) when grouped by reaction time category.
In separate analyses, we fit the steady-state choice behavior model to behavioral data from each animal on a single trial basis, excluding the first 10 trials from each reward block (see Materials and Methods). Table 1 presents the AIC values for variants of Equation 2 that include subsets of parameters from the full model. Of all linear models tested, the lowest AIC value occurs for models of the form described in Equation 2. A nonlinear extension to Equation 2 that incorporates an additional δLRLR term did not significantly improve the model fit (Table 1).
The optimal parameterization of Equation 2 for each animal is described in Equations 6 and 7. Figure 5, b and d, shows the T1 choice probabilities predicted by these models for both monkeys. As suggested by their low AIC values (Table 1), the models shown in Equations 6 and 7 provide a good quantitative fit to the behavioral data. These fits do not change substantially when express saccades are included (as shown) or excluded (not shown) from the training data.
A prediction of the steady-state choice behavior model is that target choice bias switches from one target to the other over time during conflict scenarios. We tested this prediction by summarizing T1 choice probability for all congruent and conflict scenarios in our behavioral data and plotting this as a function of reaction time (Fig. 6). We then overlaid the T1 choice probability values that are predicted by Equations 6 and 7. The model predictions provide a strong quantitative fit to the behavioral data, consistent with the low AIC values shown in Table 1. Notably, during conflict scenarios, the choice probability transitions between targets at times near Tswitch = 180 ms for monkey A and Tswitch = 137 ms for monkey S, which are predicted by the steady-state choice models.
To relax the assumption of reward-driven behavior at steady-state and therefore study the influence of prior experience on choice bias, we extended the model from Equation 2 to include data from the previous 10 trials (see Materials and Methods, Eq. 4). In this dynamic choice behavior model, T0 encodes reaction time on the current trial, Li encodes luminance contrast on trial i, and (RiT1 − RiT2) encodes the experienced reward on preceding trial i (see Materials and Methods). All variables are scaled for consistency with the GLM fits shown in Equations 6 and 7. The coefficients αi, βi, γiL, and γiR specify the weighting of the associated regressor values on indexed trial i. Table 2 shows the AIC values for variants of Equation 4 that include subsets of parameters from the full model, similar to the analysis shown in Table 1. We were unable to fit a nonlinear interaction term, δiLR × Li × (RiT1 − RiT2), while maintaining p < 0.05 for all parameters in the model, likely due to sampling limitations. Of all models tested, the lowest AIC value occurs for linear models of the form described in Equation 4.
Figure 7 presents the parameter fits for the dynamic choice behavior model. For monkey A, αi = 2.85 on the current trial, but drops to approximately zero for previous trials (Fig. 7a). γiL exhibits similar behavior. By contrast, −γiR decays monotonically from 1.99 on the previous trial to 0.36 at a 10-trial lag. These results are qualitatively similar for monkey S (Fig. 7b). This confirms that, in the context of the LRS task, the luminance contrast kernel is dominated by the current trial only, whereas the reward difference kernel takes a weighted sum of rewards experienced at a trial lag of at least 10 trials.
Discussion
In this study, we use a novel LRS task to demonstrate that choice biases derived from top-down and bottom-up processes combine linearly in time to drive selection. We quantify the dependence of these biases on reaction time and prior experience using computational models of steady-state and dynamic choice behavior. Based on this quantification, the models in Equations 6 and 7 predict—and the behavioral data in Figure 6 confirm—that competing sensory and reward-driven processes drive a “switch” in target selection bias at Tswitch ≈ 140–180 ms. Our findings in monkeys agree with and build substantially on the human attentional literature, which has shown that visual selection is completely stimulus driven at timescales <150 ms, while volitional control based on expectancy drives selection at later times (Theeuwes, 2010).
Two features of the LRS paradigm were critical to the recovery of our findings: simultaneous manipulation of top-down and bottom-up attention, and spatial randomization of target locations. Most previous studies of the competitive interaction between top-down and bottom-up processes manipulate attention using separate target and distractor stimuli. Typically, a precue illuminates first, followed by a distractor at some delay, followed by a cue to perform a movement (Reynolds et al., 1999; Bisley and Goldberg, 2003; Giordano et al., 2009; Liu et al., 2009). In such paradigms, spatial attention is allocated to the precue before attention is captured by the distractor, and reward-driven endogenous attention is allocated when the cue appears after the distractor. Therefore, multiple forms of attention are deployed over partially overlapping time intervals. In such cases, the time course of their evolution and interaction is difficult to map without presenting competing stimuli simultaneously and allowing the subject to react immediately, as we do here.
Presenting the two targets at random spatial locations on each trial allowed us to control for the influence of spatial attention on choice behavior. Previous work has shown that spatial biases induced by cueing have a suppressive influence on exogenous attention (Liu et al., 2009) and can improve acuity in the attended area at the expense of unattended areas (Montagna et al., 2009). These data suggest that preexisting spatial attention may bias competition between stimulus- and goal-driven attention. We reasoned that repeated presentation of two targets to a predictable set of locations might lead to similar biases. In principle, randomization should lead to the deployment of reward-driven attention only and provide an unbiased measurement of its competitive interaction with sensory drive.
Reaction time dependence of selection biases
The timescales underlying top-down and bottom-up selection processes have been a major focus of experimental work over the last 20 years. Human behavioral studies using the additional singleton task (Theeuwes, 1992; Kim and Cave, 1999; Theeuwes et al., 2000), in which a distractor singleton is presented at a variable delay before a target singleton, have established that the interference effect of a distractor is present at stimulus onset asynchronies of up to 150 ms before the target singleton. These results in humans are consistent with our observation in monkeys that time-varying stimulus- and reward-driven selection biases can have a balanced influence on behavior at “switching” times that range from 140 to 180 ms. It is interesting that monkey S appears to pursue a reward-maximizing strategy by waiting until well after this transition (mean reaction time = 192 ± 31 ms vs Tswitch = 137 ms), while monkey A exhibits greater exogenously driven behavior (mean reaction time = 168 ± 30 ms vs Tswitch = 180 ms).
Although the physiology of top-down and bottom-up competition remains poorly understood, there is experimental support at the single neuron level for “switching” between bottom-up and top-down selection biases at the timescales discussed here. Recordings from isolated V4 neurons in behaving monkeys show that firing rate modulations occur after 175 ms when the animal is looking for a color singleton among an array of targets (Ogawa and Komatsu, 2004), while these modulations do not occur when the monkeys search for a shape singleton. Other studies have demonstrated the dominant influence of salient stimuli on neural activity at times <150 ms in inferior temporal cortex (Chelazzi et al., 1998; Chelazzi et al., 2001), posterior parietal cortex (Constantinidis and Steinmetz, 2005), and lateral intraparietal cortex (LIP) (Buschman and Miller, 2007). These timescales are consistent with the time-varying stimulus- and reward-driven choice biases described here, and the neural activity may therefore reflect attentional “switching” during competition.
Evidence for additive drivers of selection
An extensive literature has investigated the nature of competitive interactions between top-down and bottom-up attention. The biased competition theory of selective attention (Desimone and Duncan, 1995; Desimone, 1998) has been especially influential. One of its three basic principles of control suggests that competition can be biased by reward-driven and stimulus-driven factors. In this framework, competition between systems is integrated, and the target that is selected by upstream processors will be favored by downstream processors. Importantly, biased competition implies a joint, and potentially nonlinear, dependence of choice behavior on sensory and goal-directed processes. Here we show that the effects of these processes on choice behavior are additive over the 100–300 ms reaction timescale studied. One interpretation of these findings is that competition between systems may not be integrated over this timescale. Instead, the dissociable influence of luminance and reward biases on selection implies that top-down and bottom-up processes are functionally independent during the LRS task.
The brain is known to combine information about decision variables using a weighted sum in the auditory (Green, 1958) and visual (Young et al., 1993; Kinchla et al., 1995; Landy et al., 1995) systems, and there is recent evidence for this in parietal association cortex (Ipata et al., 2009). Our steady-state model fits demonstrate that a relatively simple linear model is sufficient to reveal the time evolution of choice biases and their influence on behavior at reaction times from 100 to 300 ms. Future work can investigate the deviation of observed choice behavior from model predictions at reaction times longer than 250 ms during congruent scenarios for monkey A, and during conflict scenarios for monkey S.
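To make the additive scheme concrete, the following minimal sketch shows how two linearly time-varying biases could be summed and how a “switching” time arises when they cancel. It is an illustration under stated assumptions, not the fitted model: the logistic link, the function names, and all parameter values are hypothetical and were chosen only so that the two biases cross within the 140–180 ms range discussed above.

```python
import math

def linear_bias(intercept, slope_per_ms, t_ms):
    """Bias that changes linearly with time after stimulus onset."""
    return intercept + slope_per_ms * t_ms

def p_choose_bright(t_ms, lum, rew):
    """Probability of choosing the bright target on a conflict trial.

    Assumption: the luminance bias favors the bright target, the reward
    bias favors the other target, and the two combine additively before
    a logistic link (the link function is an illustrative choice).
    """
    net = linear_bias(*lum, t_ms) - linear_bias(*rew, t_ms)
    return 1.0 / (1.0 + math.exp(-net))

def switching_time(lum, rew):
    """Time at which the two linear biases are equal in magnitude."""
    (lum_0, lum_slope), (rew_0, rew_slope) = lum, rew
    return (rew_0 - lum_0) / (lum_slope - rew_slope)

# Hypothetical (intercept, slope per ms) pairs, not fitted values.
LUM = (3.0, -0.01)     # luminance bias decays with time
REW = (0.0, 0.00875)   # reward bias grows with time

if __name__ == "__main__":
    t_switch = switching_time(LUM, REW)
    print(f"switching time: {t_switch:.1f} ms")
    for t in (100, t_switch, 250):
        print(f"t = {t:6.1f} ms  P(bright) = {p_choose_bright(t, LUM, REW):.3f}")
```

With these hypothetical parameters, choices favor the bright target before the crossing point, are balanced at it, and favor the high-reward target afterward, which is the qualitative pattern the additive account predicts.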
The behavioral data shown here are consistent with a race between two signals that favor separate targets. The race paradigm has been used previously to model the dynamics of behavior (Logan and Cowan, 1984; Boucher et al., 2007). These models connect with our analysis, in which the luminance and reward signals shown in Equation 1 could serve as inputs to the race signals for each target. Future analysis of our data using a race model may provide greater insight into the reaction time distributions shown in Figure 1.
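A race of this kind can be sketched as two independent noisy accumulators, one per target, where the first to reach a threshold determines both the choice and the reaction time. This is a generic illustration rather than the specific models of the cited studies or an analysis of our data: the drift rates, threshold, noise level, and function names are all hypothetical, and how the luminance and reward signals would map onto the drift rates is left as an assumption.

```python
import random

def simulate_race(drift_a, drift_b, threshold=1.0, noise_sd=0.02,
                  dt_ms=1.0, max_ms=1000.0, rng=None):
    """Simulate one trial of an independent two-accumulator race.

    Each accumulator starts at zero and on every time step gains its
    drift (per ms) plus Gaussian noise. Returns (winner, reaction time
    in ms); winner is None if neither reaches threshold by max_ms.
    """
    rng = rng or random.Random()
    x_a = x_b = 0.0
    t = 0.0
    while t < max_ms:
        t += dt_ms
        x_a += drift_a * dt_ms + rng.gauss(0.0, noise_sd)
        x_b += drift_b * dt_ms + rng.gauss(0.0, noise_sd)
        a_done, b_done = x_a >= threshold, x_b >= threshold
        if a_done and b_done:
            # Both crossed on the same step: the larger value wins.
            return ("A" if x_a >= x_b else "B"), t
        if a_done:
            return "A", t
        if b_done:
            return "B", t
    return None, max_ms

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical drifts: target A receives the stronger combined input.
    trials = [simulate_race(0.006, 0.004, rng=rng) for _ in range(500)]
    frac_a = sum(w == "A" for w, _ in trials) / len(trials)
    mean_rt = sum(rt for _, rt in trials) / len(trials)
    print(f"P(choose A) = {frac_a:.2f}, mean RT = {mean_rt:.0f} ms")
```

In such a sketch, the reaction time distribution emerges from the accumulation process itself, so the same parameters constrain both choice probabilities and reaction times.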
Changing selection bias as changing utility
One interpretation of the time-dependent selection biases shown here is that the utility of targets changes as a function of reaction time. For the monkeys to make a choice on each trial, they must integrate information regarding the luminance and reward magnitudes of the targets by first converting these values into a “common currency” in which they can be compared (Sugrue et al., 2005; Kable and Glimcher, 2009). Only then can the monkeys form a subjective value for each target and make a choice. Interestingly, the manner in which luminance and reward are converted to this common currency changes over time in the LRS task. Early in a trial, the subjective value of choosing the bright target is driven largely by exogenous attention. As time progresses, however, endogenous attention increases the weighting placed on the high reward magnitude target.
Implications for priority map formation
On a physiological level, our findings are consistent with the formation of two spatial priority maps in the brain that drive the selection process in an additive manner. In area LIP, there is evidence for priority map formation based on visual salience (Gottlieb et al., 1998; Bisley and Goldberg, 2003) and expected reward (Platt and Glimcher, 1999; Bendiksby and Platt, 2006; Rorie et al., 2010). Since LIP receives direct input from dorsal visual areas, it is possible that this area encodes spatial priority based on exogenously captured attention, in addition to expected reward. This interpretation could reasonably be extended to frontal areas such as the frontal eye fields, which share strong reciprocal connections with LIP.
This work was supported, in part, by National Science Foundation (NSF) CAREER Award BCS-0955701, NIH Grant R01-MH087882 as part of the NSF/NIH Collaborative Research in Computational Neuroscience Program, NIH Grant T32 MH19524 (D.A.M.), a Swartz Fellowship in Theoretical Neurobiology (D.A.M.), a Career Award in the Biomedical Sciences from the Burroughs Wellcome Fund (B.P.), a Watson Program Investigator Award from NYSTAR (B.P.), a McKnight Scholar Award (B.P.), and a Sloan Research Fellowship (B.P.).
Correspondence should be addressed to Dr. Bijan Pesaran, 4 Washington Place, Room 809, New York, NY 10003.