Do learning rates adapt to the distribution of rewards?

Samuel J Gershman

doi:10.3758/s13423-014-0790-3

Psychon Bull Rev. 2015 Oct;22(5):1320-7. doi: 10.3758/s13423-014-0790-3.

Author

Samuel J Gershman¹

Affiliation

¹ Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Ave., Room 46-4053, Cambridge, MA, 02139, USA. sjgershm@mit.edu.

PMID: 25582684
DOI: 10.3758/s13423-014-0790-3

Abstract

Studies of reinforcement learning have shown that humans learn differently in response to positive and negative reward prediction errors, a phenomenon that can be captured computationally by positing asymmetric learning rates. This asymmetry, motivated by neurobiological and cognitive considerations, has been invoked to explain learning differences across the lifespan as well as a range of psychiatric disorders. Recent theoretical work, motivated by normative considerations, has hypothesized that the learning rate asymmetry should be modulated by the distribution of rewards across the available options. In particular, the learning rate for negative prediction errors should be higher than the learning rate for positive prediction errors when the average reward rate is high, and this relationship should reverse when the reward rate is low. We tested this hypothesis in a series of experiments. Contrary to the theoretical predictions, we found that the asymmetry was largely insensitive to the average reward rate; instead, the dominant pattern was a higher learning rate for negative than for positive prediction errors, possibly reflecting risk aversion.
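
The asymmetric learning rates mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's model code: the function names, the parameter names (lr_pos, lr_neg), and the restriction to binary 0/1 rewards are assumptions made for illustration only. The sketch shows a single value update that applies a different learning rate depending on the sign of the reward prediction error, and the value at which the expected update is zero under that rule.

```python
def update_value(value, reward, lr_pos, lr_neg):
    """Apply one value update with a sign-dependent learning rate.

    lr_pos is used when the prediction error is non-negative,
    lr_neg when it is negative (hypothetical parameter names).
    """
    delta = reward - value  # reward prediction error
    lr = lr_pos if delta >= 0 else lr_neg
    return value + lr * delta


def expected_fixed_point(p_reward, lr_pos, lr_neg):
    """Value at which the expected update is zero, assuming rewards are
    1 with probability p_reward and 0 otherwise (an illustrative assumption).

    Setting p * lr_pos * (1 - V) = (1 - p) * lr_neg * V and solving for V.
    """
    gain = p_reward * lr_pos
    loss = (1.0 - p_reward) * lr_neg
    return gain / (gain + loss)


if __name__ == "__main__":
    # Symmetric rates recover the true reward probability.
    print(expected_fixed_point(0.5, 0.2, 0.2))
    # A higher rate for negative errors pulls the estimate below it.
    print(expected_fixed_point(0.5, 0.1, 0.3))
```

Under these assumptions, a higher learning rate for negative than for positive prediction errors yields value estimates below the true reward probability, which is one way to read the abstract's suggestion that the observed asymmetry may reflect risk aversion.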

Keywords: Decision-making; Multi-armed bandit; Reinforcement learning.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Female
Humans
Learning*
Male
Motivation*
Probability Learning
Reinforcement, Psychology*
Reward*
Risk-Taking