2006 Special Issue
Neural systems implicated in delayed and probabilistic reinforcement
Delayed and uncertain reinforcement: The problems of learning and choice
Natural and artificial learning agents must grapple with the problem of selecting actions to achieve the best possible outcome under their value system. However, the outcome of a given action is not always certain and immediate. Outcomes are frequently uncertain: agents do not always obtain that for which they work. Furthermore, when an agent acts to obtain reward or reinforcement, there is often a delay between its action and the ultimate outcome. This applies both to positive reinforcers
Individual differences: Risk taking and impulsivity
Individual differences in responsivity to uncertain or delayed reinforcement are also of considerable interest. When making decisions under conditions of uncertainty, individuals vary as to how much uncertainty or risk they are willing to tolerate. Formally, individuals differ in how much they discount the value of reinforcers as the uncertainty of the reinforcer increases (i.e. as the probability of the reinforcer declines, or the odds against obtaining the reinforcer increase) (Ho, Mobini,
Learning with delayed reinforcement in normal animals
Delays can hamper both Pavlovian and instrumental conditioning (Dickinson, 1980, Dickinson, 1994, Gallistel, 1994, Hall, 1994, Mackintosh, 1983): for example, instrumental conditioning has long been observed to be systematically impaired as the outcome is delayed (Dickinson et al., 1992, Grice, 1948, Harker, 1956, Lattal and Gleeson, 1990, Perin, 1943, Skinner, 1938). Despite this, normal rats have been shown to acquire free-operant responding with programmed response–reinforcer delays of up to
Delayed and probabilistic reinforcement: Equivalent or distinct processes?
It has been suggested that delay (or temporal) discounting, the process by which delayed reinforcers lose value, and probability (or odds) discounting, the process by which uncertain reinforcers lose value, reflect the same underlying process (Green and Myerson, 1996, Mischel, 1966, Mazur, 1989, Mazur, 1995, Mazur, 1997, Rachlin et al., 1987, Rachlin et al., 1986, Rachlin et al., 1991, Rotter, 1954, Sozou, 1998, Stevenson, 1986). For example, choosing an uncertain reinforcer five times but only
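As a rough illustration of how delay discounting and odds discounting could reflect one underlying process, the sketch below assumes the hyperbolic forms commonly used in this literature: V = A / (1 + kD) for a reinforcer of magnitude A delayed by D, and V = A / (1 + hθ) for a reinforcer delivered with probability p, where θ = (1 − p) / p is the odds against obtaining it. These functional forms, the function names, and the parameter values are assumptions made for illustration; they are not taken from the text above.

```python
def odds_against(p: float) -> float:
    """Odds against obtaining a reinforcer delivered with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    return (1.0 - p) / p


def delay_discounted_value(amount: float, delay: float, k: float) -> float:
    """Assumed hyperbolic delay discounting: V = A / (1 + k * D)."""
    if delay < 0 or k < 0:
        raise ValueError("delay and k must be non-negative")
    return amount / (1.0 + k * delay)


def odds_discounted_value(amount: float, p: float, h: float) -> float:
    """Assumed hyperbolic odds discounting: V = A / (1 + h * theta)."""
    if h < 0:
        raise ValueError("h must be non-negative")
    return amount / (1.0 + h * odds_against(p))


if __name__ == "__main__":
    amount = 10.0   # arbitrary reinforcer magnitude
    k = h = 1.0     # arbitrary, equal discounting parameters

    # A reinforcer delivered with p = 0.25 has odds of 3 to 1 against it.
    v_uncertain = odds_discounted_value(amount, p=0.25, h=h)

    # Under these assumed forms, the same value results from a certain
    # reinforcer delayed by 3 time units when k equals h.
    v_delayed = delay_discounted_value(amount, delay=3.0, k=k)

    print(f"value of uncertain reinforcer: {v_uncertain}")
    print(f"value of delayed reinforcer:   {v_delayed}")
```

With equal parameters, the two functions are the same curve evaluated on different axes (delay versus odds against). That is one way of reading the single-process suggestion. If the processes are distinct, manipulations could change k and h independently.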
Systemic pharmacological studies
Given the importance of impulsive choice in disorders such as addiction (Bickel et al., 1999, Evenden, 1999a, Heyman, 1996, Mitchell, 1999, Poulos et al., 1995) and attention-deficit/hyperactivity disorder (ADHD) (Sagvolden et al., 1998, Sagvolden and Sergeant, 1998), a number of groups have studied the effects on impulsive choice of manipulating neurochemical and neuroanatomical systems implicated in these disorders. I will review pharmacological and neurochemical studies first. To date, more
Neuroanatomically specific studies
In recent years, a number of studies have examined the effects of focal excitotoxic or neurochemical lesions on choice and learning involving delayed or uncertain rewards, in addition to correlational studies using functional imaging, microdialysis, and electrophysiology. These studies centre on interconnected structures forming part of the limbic corticostriatal loop (Fig. 2).
Conclusions
A number of limbic corticostriatal structures, together with the major forebrain neuromodulatory systems, play a role in learning and choice involving delayed and probabilistic rewards. The contribution of these structures is best understood for delayed reward (Fig. 2), although recent functional imaging and lesion studies have examined the neuroanatomical basis of choice involving uncertain reward.
To summarize, many structures have been implicated in the processing of delayed and/or
Acknowledgements
This work was supported by the UK Medical Research Council (MRC) and the Wellcome Trust within the University of Cambridge Behavioural and Clinical Neuroscience Institute. I thank two anonymous referees for their helpful comments.
References (373)
- et al. (1990). Functional architecture of basal ganglia circuits: Neural substrates of parallel processing. Trends in Neurosciences.
- et al. (2000). Changes of cerebrospinal fluid monoamine metabolites during long-term antidepressant treatment. European Neuropsychopharmacology.
- et al. (1994). Effects of ibotenic acid lesions of the nucleus accumbens on instrumental action. Behavioural Brain Research.
- et al. (1998). Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates. Neuropharmacology.
- et al. (1999). Differential responsiveness of dopamine transmission to food-stimuli in nucleus accumbens shell/core compartments. Neuroscience.
- et al. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition.
- (2000). Measuring hedonic impact in animals and infants: Microstructure of affective taste reactivity patterns. Neuroscience and Biobehavioral Reviews.
- et al. (1974). Rapid depletion of serum tryptophan, brain tryptophan, serotonin and 5-hydroxyindoleacetic acid by a tryptophan-free diet. Life Sciences.
- et al. (2001). Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron.
- et al. (2001). Effects of dorsal and ventral striatal lesions on delayed matching trained with retractable levers. Behavioural Brain Research.