The neural mechanisms underlying decision making have often been probed by asking subjects to choose between movements (e.g., making a saccade to a left or right target) (Sugrue et al., 2005). In studying decisions, the influence of so-called “top-down” factors such as reward magnitude, reward probability, and motivation is of fundamental interest and importance. Unfortunately, these parameters are much more difficult to quantify and manipulate experimentally than are “bottom-up” factors such as contrast and luminance. In their recent article in The Journal of Neuroscience, Milstein and Dorris (2007) studied the effect of expected value (the product of reward magnitude and probability) on saccadic control. This work builds on the saccadic decision literature, the vast majority of which is based on investigations in nonhuman primates (Schall, 2001; Ikeda, 2003).
Milstein and Dorris (2007) instructed human participants to fixate a central spot and then make a saccadic eye movement to a red target that appeared after a 400 ms blank screen (Fig. 1) [Milstein and Dorris (2007), their Fig. 1 (http://www.jneurosci.org/cgi/content/full/27/18/4810/F1)]. A monetary reward for a correct saccade was then displayed. The magnitude of the reward depended on the side of the target and varied across blocks of trials. The probability of the target appearing on the left or right was fixed within each block but also varied independently across blocks. The aim was to investigate whether saccadic preparation toward a particular target was influenced by the expected value of that location.
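To make the manipulated quantity concrete, the following minimal Python sketch computes the expected value of each target side for a single block as the product of reward magnitude and target probability. The class name `Block`, its field names, and the numbers in the usage lines are hypothetical illustrations and are not taken from Milstein and Dorris (2007).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One block of trials (hypothetical structure, for illustration only)."""
    p_left: float        # probability that the target appears on the left
    reward_left: float   # reward magnitude for a correct leftward saccade
    reward_right: float  # reward magnitude for a correct rightward saccade

    def __post_init__(self):
        # Validation only; no fields are modified here.
        if not 0.0 <= self.p_left <= 1.0:
            raise ValueError("p_left must lie between 0 and 1")

    def expected_values(self):
        """Return expected value per side: probability x reward magnitude."""
        p_right = 1.0 - self.p_left
        return {
            "left": self.p_left * self.reward_left,
            "right": p_right * self.reward_right,
        }


# Hypothetical block: the left target is more probable, the right pays more.
example = Block(p_left=0.75, reward_left=1.0, reward_right=5.0)
print(example.expected_values())
```

In this made-up block, the rarer right target carries the higher expected value (1.25 versus 0.75 in arbitrary units), which shows how probability and magnitude can be varied independently while still yielding a single ordering by expected value.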
Saccadic reaction time (SRT) was measured as a conventional index of preparation. In addition, the authors used a relatively novel probe to interrogate saccadic preparation before target presentation: on 30% of trials, a distractor was presented halfway through the warning period, after 200 ms (Fig. 1A). The distractors were green, rather than red, but were otherwise identical to real targets and served to trigger erroneous saccades termed “oculomotor capture.” The underlying assumption was that a certain amount of saccadic preparedness must already have been reached if distractors were able to bring the planning circuitry to saccade threshold (Theeuwes, 1999). No reward was given on trials in which oculomotor capture occurred, so there was an incentive to avoid saccades to distractors.
The authors demonstrated a remarkably clear effect on SRT. A significant negative correlation between expected value and reaction time was found: the greater the reward associated with a particular movement, the shorter the SRT. Impressively, the distractor trials also demonstrated a significant bias of oculomotor captures toward the side of higher expected value. However, there may be several causes for this bias. One confound that could explain the effect, on both targets and distractors, is attention. If attention toward a location of higher expected value is increased, then a lower threshold (and shorter SRT) might result. The authors concede that their task is not suitable for teasing apart saccadic preparation and visuospatial attention mechanisms. It therefore remains to be determined whether the effect of value here is primarily on the motor system.
Another question regarding the effects generated by distractors is the nature of the bottom-up effect of color. The authors did not examine how reliably subjects distinguished between red targets and green distractors at the retinal eccentricity used in this task. Moreover, the novelty or salience of the green distractor may influence attentional bias and thus interact with value. Clearly, more investigation is required to delineate the relative contributions of prior expected value versus the allocation of spatial attention to the preparation of motor commands. This question could be examined using same-colored distractors, so that the discrimination would be temporal and/or spatial only, rather than determined by a combination of cues.
In these experiments, dual-target trials (Fig. 1B) (10%) were used to assess motivation because the authors were concerned that a delayed financial reward might not sufficiently influence saccadic preparation. Milstein and Dorris (2007) found a choice bias toward the high-magnitude target, confirming the salience of reward magnitude as a variable. These trials may have additional significance, discussed later.
The second experiment investigated the spatial representation of expected value. An array of distractor locations (Fig. 2A) was used to map the probability of oculomotor capture as a function of distance from the target site (Fig. 2B) [Milstein and Dorris (2007), their Figs. 1 (http://www.jneurosci.org/cgi/content/full/27/18/4810/F1), 3 (http://www.jneurosci.org/cgi/content/full/27/18/4810/F3), 6 (http://www.jneurosci.org/cgi/content/full/27/18/4810/F6), 8 (http://www.jneurosci.org/cgi/content/full/27/18/4810/F8)]. The data from these saccades were used to create heat or contour maps, which the authors suggest might represent the actual oculomotor/retinotopic mapping of expected value, comprising a dynamic neural field. Their inference here is that behavioral output can be mapped directly to a neuronal representation of value in space (Fig. 2C). The authors conclude that expected value is an important factor in saccadic preparation. However, the conditions under which expected value, rather than direction probability or reward value asymmetry, determines saccade preparation are still unclear. Interestingly, oculomotor capture depended on the probability of a target but not on reward magnitude asymmetry. On dual-target trials, however, choice bias depended on reward magnitude but not on probability asymmetries. In contrast, SRTs to standard targets were influenced by both parameters and were therefore compatible with the expected value hypothesis. The results are difficult to explain without two independent representations for reward magnitude and probability. The authors reasonably infer that the competition in dual-target trials reduces the effect of probability, and that value may have its effect mainly by adjusting the gain of the visual inputs.
If this is the case, we would expect that, when reward direction conflicts with probability, the dual-target choice SRT would be longer for the more common saccades to the rare but rewarded side than for the less frequent saccades to the unrewarded but more probable side, because the former involve canceling a saccadic program.
An alternative explanation is that during the early stage of saccadic preparation, oculomotor maps contain only information about the probability of a target at each location (prior probability). The reward magnitudes are then incorporated gradually during the preparatory period. Consistent with this view, Ding and Hikosaka (2007) have very recently found that the effect of reward accumulates gradually during the course of a single trial's “foreperiod” (the time between fixation and target presentation). Whether oculomotor capture also exhibits sensitivity to reward later in the foreperiod is an empirical question. If it does, this model would also explain why capture by distractors 200 ms before the expected target onset was less affected by reward magnitude than was the target SRT or the dual-target bias. It could also explain the fact that SRT (determined after reward magnitude has been incorporated during the preparatory period) depends on both probability and reward magnitude. A separate competitive process is needed to explain why choice between two simultaneous targets depends on their reward values rather than their prior probabilities.
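The logic of this alternative account can be sketched in a few lines of Python. The sketch below is a toy illustration only: the function name `preparation_signal`, the linear ramp, the multiplicative form, the 400 ms foreperiod default, and all numerical values are assumptions made here for clarity, not a model proposed by Milstein and Dorris (2007) or by Ding and Hikosaka (2007). The signal starts as the prior probability alone and ends as the full expected value once reward magnitude has been completely incorporated.

```python
def preparation_signal(p, magnitude, t_ms, foreperiod_ms=400.0):
    """Toy preparation signal for one location at time t_ms into the foreperiod.

    Assumed form: p * magnitude ** w, where the weight w ramps linearly
    from 0 (probability only) to 1 (full expected value, p * magnitude).
    """
    if magnitude < 0:
        raise ValueError("magnitude must be non-negative")
    # Clamp the ramp weight to the interval [0, 1].
    w = min(max(t_ms / foreperiod_ms, 0.0), 1.0)
    return p * magnitude ** w


# Hypothetical conflict block: one side is probable but poorly rewarded,
# the other is rare but richly rewarded (arbitrary units).
probable_side = {"p": 0.75, "magnitude": 1.0}
rewarded_side = {"p": 0.25, "magnitude": 5.0}

for t in (0.0, 200.0, 400.0):
    a = preparation_signal(probable_side["p"], probable_side["magnitude"], t)
    b = preparation_signal(rewarded_side["p"], rewarded_side["magnitude"], t)
    print(f"t={t:5.0f} ms  probable side={a:.3f}  rewarded side={b:.3f}")
```

Under these assumptions, the signal at 200 ms still favors the more probable side, whereas by target onset it favors the side with the higher expected value. This is the qualitative pattern the account requires: a mid-foreperiod distractor probe would mainly reflect probability, and the later SRT would reflect both factors.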
In summary, Milstein and Dorris (2007) have demonstrated a clear effect of expected reward value on the latency of responses to visual targets in humans, building on previous studies in nonhuman primates (Ikeda, 2003). The interesting measure of preparation revealed through the oculomotor capture probe immediately suggests several possible mechanisms, including the dynamic neural field model that the authors suggest. It also raises many interesting questions regarding the linkage between reward, action control, and attention. More work in humans and animal models will ultimately demonstrate which model best represents how value influences the programming of saccades.
Editor's Note: These short reviews of a recent paper in the Journal, written exclusively by graduate students or postdoctoral fellows, are intended to mimic the journal clubs that exist in your own departments or institutions. For more information on the format and purpose of the Journal Club, please see http://www.jneurosci.org/misc/ifa_features.shtml.
We thank David Milstein and Michael Dorris for generously supplying their original data.
Correspondence should be addressed to Robert J. Adam, Institute of Cognitive Neuroscience, Institute of Neurology, University College London, Alexandra House, 17 Queen Square, London WC1N 3AR, UK.