Opinion
A normative perspective on motivation

https://doi.org/10.1016/j.tics.2006.06.010

Understanding the effects of motivation on instrumental action selection, and specifically on its two main forms, goal-directed and habitual control, is fundamental to the study of decision making. Motivational states have been shown to ‘direct’ goal-directed behavior rather straightforwardly towards more valuable outcomes. However, how motivational states can influence outcome-insensitive habitual behavior is more mysterious. We adopt a normative perspective, assuming that animals seek to maximize the utilities they achieve, and viewing motivation as a mapping from outcomes to utilities. We suggest that habitual action selection can direct responding properly only in motivational states which pertained during behavioral training. However, in novel states, we propose that outcome-independent, global effects of the utilities can ‘energize’ habitual actions.

Introduction

Motivation occupies center stage in the psychology and behavioral neuroscience of decision making, and specifically instrumental action selection. There has been a recent renaissance in sophisticated analyses of motivation, primarily because manipulations such as specific satiety or motivational shifts have been used to tease apart different types of instrumental behaviors, namely, ‘goal-directed’ and ‘habitual’ control. These studies suggest that goal-directed and habitual actions are distinguished by the former's, but not the latter's, sensitivity to the utility of their specific outcomes [1]. Although goal-directed and habitual behavior can be characterized by their differing motivational sensitivities, and the effects of motivational manipulations on goal-directed behavior are relatively clear, exactly how (and indeed, whether) motivation influences habitual responding has remained unresolved. This gap is particularly troubling because habitual responding plays a prominent part in both normal and abnormal behavior.

That our understanding of motivational control is lacking might be partly because motivation itself is not a unitary construct [2]. In fact, Dickinson and Balleine [1] trace back to Descartes two very distinct influences of motivation on behavior: a ‘directing’ effect, determining the current goal(s) of behavior (e.g. food or water), and an ‘energizing’ effect, which determines the force or vigor underlying those actions. The latter is closely linked to Hullian ‘generalized drive’ [3, 4, 5], a motivational process that serves to energize all pre-potent actions. Whereas much is known about the directing aspects of motivation, the ‘energizing’ effects of generalized drive have remained highly controversial.

Here, we confront this challenge. We start by suggesting a simple, normative notion of motivation that allows us to define precisely outcome-specific ‘directing’ effects and outcome-independent ‘energizing’ effects. We then suggest that the outcome-specific effects of a novel motivational state predominantly influence goal-directed behavior, whereas the ‘energizing’ effects of generalized drive are seen in habitual responding [6]. As only preliminary experimental results on the latter hypothesis exist, we describe how it can best be tested, and detail its implications for both the understanding of motivational control and the resolution of the age-old debate regarding the existence of generalized drive.

Section snippets

Motivation: a mapping from outcomes to utilities

Our conception of motivation is strongly influenced by the field of reinforcement learning [7]. In reinforcement learning, outcomes such as food or water have numerical utilities, and the imperative is to choose actions to maximize a long-term measure of total utility. However, in different motivational states, outcomes may have different utilities. We therefore define motivation as the mapping between outcomes and their utilities, and refer to ‘motivational states’ (e.g. ‘hunger’ or ‘thirst’)

Goal-directed behavior: a ‘brute force’ solution

Almost by definition, the goal-directed system uses what is called a ‘forward model’, working out the ultimate outcomes consequent on a sequence of actions by searching through the tree of state-actions-consequences, and choosing actions based on the outcomes’ current utilities (Figure 1b) [10]. Specific satiety and conditioned taste-aversion procedures (Box 2) have shown that action choice in this system is sensitive to manipulations that alter outcome utilities [14, 15, 16, 17, 18, 19, 20, 21].
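As a rough illustration of this idea, the following minimal Python sketch treats each motivational state as a mapping from outcomes to utilities and lets a forward model search the tree of states, actions and consequences exhaustively. The task layout, the state and action names, and all utility values are hypothetical assumptions chosen for illustration; they are not taken from the article or its figures.

```python
# Illustrative toy task: every name and number here is an assumption,
# not a value from the article.

# A motivational state is modeled as a mapping from outcomes to utilities.
MOTIVATION = {
    "hungry": {"food": 4.0, "water": 1.0, "nothing": 0.0},
    "thirsty": {"food": 1.0, "water": 4.0, "nothing": 0.0},
}

# Forward model: state -> action -> (next state, outcome obtained).
# States without an entry (here "end") are terminal.
TRANSITIONS = {
    "start": {
        "left": ("left_arm", "nothing"),
        "right": ("right_arm", "nothing"),
    },
    "left_arm": {
        "press_lever": ("end", "food"),
        "wait": ("end", "nothing"),
    },
    "right_arm": {
        "pull_chain": ("end", "water"),
        "wait": ("end", "nothing"),
    },
}


def plan(state, utilities):
    """Exhaustively search the tree below `state`.

    Returns (best summed utility, best action sequence), evaluating
    every outcome with the utilities of the current motivational state.
    """
    actions = TRANSITIONS.get(state)
    if not actions:  # terminal state: nothing more to gain
        return 0.0, []
    best_value, best_path = float("-inf"), []
    for action, (next_state, outcome) in actions.items():
        future_value, future_path = plan(next_state, utilities)
        value = utilities[outcome] + future_value
        if value > best_value:
            best_value, best_path = value, [action] + future_path
    return best_value, best_path


hungry_plan = plan("start", MOTIVATION["hungry"])
thirsty_plan = plan("start", MOTIVATION["thirsty"])
print(hungry_plan)
print(thirsty_plan)
```

Because the search consults the current outcome-to-utility mapping at every leaf, switching the mapping from the hypothetical "hungry" state to the "thirsty" state changes the chosen action sequence immediately, with no retraining. This is the outcome-specific sensitivity that revaluation procedures probe.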

Is habitual behavior doomed to be motivation-insensitive?

Normative computational models of habitual action selection view it as arising from stored (cached) values of different actions in different states (Figure 1c). Each value is defined in terms of the expected cumulative future utilities consequent on performing this action in this state. Adding together the utilities of different outcomes (food, drink, mates, etc.), cached values are thus outcome-general and defined in units of a common currency. The values are acquired through extensive

Two sides of motivational influence: the directing and the energizing

In summary, a normative analysis of the different revaluation manipulations used to establish the characteristics of habitual and goal-directed behavior suggests that the outcome-specific ‘directing’ effects of a novel motivational state influence goal-directed behavior, whereas the ‘energizing’ effects of generalized drive are seen in habitual responding. This distinction also calls for the operational definition of habitual behavior to be slightly refined. Habits are not in general

Conclusions

Motivation turns out to be a rich and complex topic, because it has multiple facets to which the various action-selection systems are differentially sensitive. Oddly, it has been easier to use motivation to dissociate these systems than it has been to use them to elucidate motivation. Our definition of motivational states in terms of mappings between outcomes and utilities provides a simple normative scaffold on which to understand both optimal and approximately optimal sensitivity to outcome

Acknowledgements

We are grateful to Misha Ahrens, Bernard Balleine, Nathaniel Daw, Máté Lengyel, Ken Norman, Tom Schonberg, Ina Weiner and Louise Whiteley for helpful comments on earlier versions of the manuscript, and to Nathaniel Daw for much discussion and sharing of ideas. Our gratitude goes to Sharon Riwkes and Eran Katz who carried out some of the experiments on motivational control of habitual behavior. This research was funded by a Dan David fellowship and a Hebrew University Rector Fellowship to Y.N.,

References (50)

  • R. Bolles

    Theory of Motivation

    (1967)
  • Y. Niv

    How fast to work: Response vigor, motivation and tonic dopamine

  • R.S. Sutton et al.

    Reinforcement Learning

    (1998)
  • A. Dickinson

    Actions and habits: The development of behavioral autonomy

    Philos. Trans. R. Soc. Lond. B Biol. Sci.

    (1985)
  • N.D. Daw

    Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control

    Nat. Neurosci.

    (2005)
  • A. Dickinson et al.

    Motivational control of goal-directed action

    Anim. Learn. Behav.

    (1994)
  • C.J.C.H. Watkins et al.

    Q-learning

    Mach. Learn.

    (1992)
  • J. O’Doherty

    Dissociable roles of ventral and dorsal striatum in instrumental conditioning

    Science

    (2004)
  • B.W. Balleine et al.

    The effect of lesions of the insular cortex on instrumental conditioning: Evidence for a role in incentive memory

    J. Neurosci.

    (2000)
  • L.H. Corbit

    The role of nucleus accumbens in instrumental conditioning: Evidence of a functional dissociation between accumbens core and shell

    J. Neurosci.

    (2001)
  • S. Killcross et al.

    Coordination of actions and habits in the medial prefrontal cortex of rats

    Cereb. Cortex

    (2003)
  • P. Holland

    Relations between Pavlovian-instrumental transfer and reinforcer devaluation

    J. Exp. Psychol. Anim. Behav. Process.

    (2004)
  • Yin, H.H. et al. (2005a) Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in...
  • Yin, H.H. et al. (2005b) The role of dorsomedial striatum in instrumental conditioning. Eur. J. Neurosci. 22,...
  • A. Dickinson et al.

    Incentive learning and the motivational control of instrumental performance

    Q. J. Exp. Psychol.

    (1989)