The Journal of Neuroscience, April 23, 2008, 28(17):4356-4367; doi:10.1523/JNEUROSCI.0647-08.2008
Behavioral/Systems/Cognitive
Learning Stochastic Reward Distributions in a Speeded Pointing Task
Anna Seydell,1 Brian C. McCann,2 Julia Trommershäuser,1 and David C. Knill3
1Department of Psychology, University of Giessen, 35394 Giessen, Germany, 2Center for Perceptual Systems, The University of Texas, Austin, Texas 78712-0187, and 3Center for Visual Science, University of Rochester, Rochester, New York 14627-0270
Correspondence should be addressed to Dr. Anna Seydell, Department of Psychology, University of Giessen, Otto-Behaghel-Strasse 10F, 35394 Giessen, Germany. Email: anna.seydell{at}psychol.uni-giessen.de
Recent studies have shown that humans effectively take into account task variance caused by intrinsic motor noise when planning fast hand movements. However, previous evidence suggests that humans have greater difficulty accounting for arbitrary forms of stochasticity in their environment, both in economic decision making and sensorimotor tasks. We hypothesized that humans can learn to optimize movement strategies when environmental randomness can be experienced and thus implicitly learned over several trials, especially if it mimics the kinds of randomness for which subjects might have generative models. We tested the hypothesis using a task in which subjects had to rapidly point at a target region partly covered by three stochastic penalty regions introduced as "defenders." At movement completion, each defender jumped to a new position drawn randomly from fixed probability distributions. Subjects earned points when they hit the target, unblocked by a defender, and lost points otherwise. Results indicate that after 600 trials, subjects approached optimal behavior. We further tested whether subjects simply learned a set of stimulus-contingent motor plans or the statistics of defenders' movements by training subjects with one penalty distribution and then testing them on a new penalty distribution. Subjects immediately changed their strategy to achieve the same average reward as subjects who had trained with the second penalty distribution. These results indicate that subjects learned the parameters of the defenders' jump distributions and used this knowledge to optimally plan their hand movements under conditions involving stochastic rewards and penalties.
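The notion of an aim point that maximizes expected gain can be illustrated with a minimal Monte Carlo sketch. It is not the authors' model: it collapses the task to one dimension and assumes Gaussian motor noise and Gaussian defender jump distributions, and every name and number in it (`TARGET`, `DEFENDER_JUMPS`, `MOTOR_SD`, the payoffs) is a hypothetical value chosen only for illustration.

```python
import random

# All values below are hypothetical illustration values, not parameters from the study.
TARGET = (0.0, 10.0)            # target interval along one axis (arbitrary units)
DEFENDER_HALF_WIDTH = 1.0       # each defender blocks center +/- this distance
DEFENDER_JUMPS = [(2.0, 0.5), (5.0, 0.5), (8.0, 0.5)]  # assumed (mean, sd) of each jump
MOTOR_SD = 0.4                  # assumed sd of pointing endpoint around the aim point
GAIN_HIT, LOSS_MISS = 1.0, -1.0


def trial_gain(aim, rng):
    """Simulate one trial: noisy endpoint first, then the defenders jump."""
    endpoint = rng.gauss(aim, MOTOR_SD)
    if not (TARGET[0] <= endpoint <= TARGET[1]):
        return LOSS_MISS  # missed the target region entirely
    for mean, sd in DEFENDER_JUMPS:
        center = rng.gauss(mean, sd)  # defender's new position after the jump
        if abs(endpoint - center) <= DEFENDER_HALF_WIDTH:
            return LOSS_MISS  # endpoint blocked by this defender
    return GAIN_HIT


def expected_gain(aim, n_samples=2000, seed=0):
    """Monte Carlo estimate of the mean gain for a given aim point."""
    rng = random.Random(seed)
    return sum(trial_gain(aim, rng) for _ in range(n_samples)) / n_samples


def best_aim(candidates, n_samples=2000, seed=0):
    """Return the candidate aim point with the highest estimated gain."""
    return max(candidates, key=lambda a: expected_gain(a, n_samples, seed))


if __name__ == "__main__":
    candidates = [i * 0.25 for i in range(41)]  # aim points from 0 to 10
    aim = best_aim(candidates)
    print(aim, expected_gain(aim))
```

Under these assumed parameters, aim points in the gaps between the defenders' mean landing positions score higher than aim points centered on a defender, which is the kind of trade-off an optimal planner would have to learn from the jump statistics.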
Key words: visuomotor control; movement planning; decision making; implicit learning; environmental statistics; maximizing expected gain
Received Nov. 6, 2007; accepted March 17, 2008.