Journal of Neuroscience
Journal Club

Neural Mechanisms for Undoing the “Curse of Dimensionality”

Avinash R. Vaidya
Journal of Neuroscience 2 September 2015, 35 (35) 12083-12084; https://doi.org/10.1523/JNEUROSCI.2428-15.2015
Montreal Neurological Institute, Department of Neurology and Neurosurgery, McGill University, Montreal, Quebec, H3A 2B4, Canada

Human behavior is marked by a sophisticated ability to attribute outcomes and events to choices and experiences with surprising nuance. Understanding the mechanisms that govern this ability is a major focus for cognitive neuroscience. Reinforcement learning (RL) theory has provided a tractable framework for researching this process and for interpreting putative neurophysiological signals underlying learning. The Rescorla–Wagner model in particular (Rescorla and Wagner, 1972), extended into temporal difference learning by Sutton and Barto (1998), has provided an especially rich set of predictions that align nicely with many behavioral and physiological results (Schultz et al., 1997). At the core of these models is the idea that the value of available options is continuously updated by comparing their expected value with feedback after each decision. This comparison yields a prediction error that is used to update expectations and guide future choices. While this model provides an elegant explanation for learning in many simple experimental conditions, it cannot be easily applied to more complex tasks, particularly when options have multiple features, or dimensions, that may each carry some value. The problem is even more obvious in real-life choices, where options are so multifaceted and multidimensional that an RL process operating over every feature would seem implausible. This difficulty has been described as the "curse of dimensionality" (Sutton and Barto, 1998).
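
The prediction-error update at the core of these models can be sketched in a few lines of Python. The learning rate and the toy reward sequence below are illustrative assumptions, not values from any of the cited studies:

```python
def rescorla_wagner_update(value, reward, alpha=0.1):
    """One Rescorla-Wagner step: the prediction error (reward - value)
    moves the expected value toward observed feedback, scaled by the
    learning rate alpha."""
    prediction_error = reward - value
    return value + alpha * prediction_error

# Repeated reward of 1.0 drives the estimate from 0 toward 1,
# with the size of each update shrinking as the error shrinks.
v = 0.0
for _ in range(50):
    v = rescorla_wagner_update(v, reward=1.0)
```

Because the update is proportional to the error, learning is fast when expectations are badly wrong and slows as they converge on the true reward rate.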

The problem of learning in multidimensional environments has long vexed animal learning theorists. Early experiments demonstrated that associative learning did not operate equally for all stimuli: Pavlov (1927) showed that conditioning by a salient stimulus could overshadow learning about a concurrent, less-salient stimulus. Related behavioral phenomena prompted learning theorists to develop more sophisticated models that included attentional modules that would adaptively select features for further learning (Mackintosh, 1975; Pearce and Hall, 1980). These models are linked by a common focus on the importance of learning about the predictive or information value of stimulus dimensions. While the predictions of these models have been investigated thoroughly through years of animal experiments (for review, see Le Pelley, 2010), the underlying neurobiological mechanisms of these processes are not well understood, particularly in the human brain.

In a recent article in The Journal of Neuroscience, Niv et al. (2015) investigated the neural correlates of learning in a multidimensional environment, using a combination of computational modeling and functional magnetic resonance imaging (fMRI). They describe a putative computational mechanism for adaptively selecting relevant features for learning and a network of brain regions that may be involved in this process. In their experiment, subjects chose between three compound stimuli that were each defined by three different dimensions (shape, texture, and color). Subjects completed a series of short blocks, in which only one dimension of the stimuli (e.g., shape) was predictive of rewarding feedback, with rewards being probabilistically more likely for one feature (e.g., 75% chance of reward for the triangle), while the other two features in that dimension were associated with a lower likelihood of reward (e.g., 25% chance of reward for the circle or square). Niv et al. (2015) argue that this task can be solved through representation learning, i.e., selecting the current state representation (or relevant stimulus dimension) in the task to guide learning.
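
The task structure can be made concrete with a rough sketch. Only the shape example and the 75%/25% contingencies come from the description above; the color and texture feature names and the sampling helper are hypothetical illustrations, not the authors' materials:

```python
import random

# Hypothetical feature sets; "shape" follows the example in the text.
DIMENSIONS = {
    "shape":   ["triangle", "circle", "square"],
    "color":   ["red", "green", "blue"],
    "texture": ["dots", "stripes", "waves"],
}

def reward_probability(stimulus, relevant_dim="shape", target_feature="triangle"):
    """Only one dimension predicts reward: 75% if the stimulus carries
    the target feature on that dimension, 25% otherwise."""
    return 0.75 if stimulus[relevant_dim] == target_feature else 0.25

def sample_trial(rng=random):
    """Build three compound stimuli, each combining one feature per
    dimension, with every feature appearing exactly once per trial."""
    shuffled = {d: rng.sample(feats, 3) for d, feats in DIMENSIONS.items()}
    return [{d: shuffled[d][i] for d in DIMENSIONS} for i in range(3)]
```

Because each stimulus mixes one feature from every dimension, the learner cannot identify the relevant dimension from a single trial; credit must be assigned across features over many choices, which is exactly where the dimensionality problem bites.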

The authors compared six computational models that fell broadly into three categories: RL models based on the Rescorla–Wagner equation, a statistically optimal Bayesian model, and a serial hypothesis model that selected among candidate stimulus dimensions based on available evidence. The data were best fit by a reinforcement learning model operating on the level of stimulus features and incorporating a decay parameter that weakened features not chosen in each trial (the fRL+decay model). The authors then asked where the hemodynamic response correlated with representation learning, as measured by the standard deviation of the feature weights predicted by the fRL+decay model. This analysis identified a bilateral network of regions corresponding to the frontoparietal attention network described by Corbetta and Shulman (2002), including the intraparietal sulcus (IPS) and dorsolateral prefrontal cortex, as well as the right lateral orbitofrontal cortex (OFC).

At first glance, it may seem surprising that the fRL+decay model described in this work provided the best fit to participants' choices. As mentioned earlier, it seems implausible that RL would operate on all available stimulus dimensions in such a task. Indeed, prior work with this task identified serial hypothesis testing as the best explanation of participants' behavior (Wilson and Niv, 2012), yet that model was outperformed by fRL+decay here. The decay parameter of the fRL+decay model was critical to its performance. By decaying the weights of unchosen features on every trial, the model effectively places an attentional filter on those features while still allowing them to influence the decision process. This relatively simple modification of the RL model made a substantial difference: without the decay parameter, a simple feature-based RL model was marginally outperformed by serial hypothesis testing. These results suggest that simple attentional mechanisms operating in the feature space of such a task provide a better explanation of associative learning than either serial hypothesis testing or feature-based RL without an attention mechanism.
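
A minimal sketch of the fRL+decay mechanism as described above, assuming stimulus values are sums of feature weights and using illustrative parameter values; this is a simplification for exposition, not the authors' implementation:

```python
def stimulus_value(weights, stimulus_features):
    """A compound stimulus's value is the sum of its feature weights;
    unseen features default to zero."""
    return sum(weights.get(f, 0.0) for f in stimulus_features)

def frl_decay_update(weights, chosen_features, reward, alpha=0.2, decay=0.5):
    """Feature-level RL with decay: chosen features are updated by a
    shared prediction error, while unchosen feature weights decay
    toward zero (the attentional filter discussed above)."""
    chosen = set(chosen_features)
    error = reward - stimulus_value(weights, chosen)
    for feature in set(weights) | chosen:
        if feature in chosen:
            # Standard delta-rule update for the chosen compound's features
            weights[feature] = weights.get(feature, 0.0) + alpha * error
        else:
            # Unchosen features lose weight but remain in the value sum
            weights[feature] = (1 - decay) * weights[feature]
    return weights
```

For example, after one rewarded choice of a {triangle, red, dots} compound starting from empty weights, each chosen feature gains alpha × error = 0.2; if the next choice omits the triangle, its weight is halved (decay = 0.5) rather than frozen, so it still competes weakly in later decisions.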

In all of the models tested by Niv et al. (2015), feature selection occurs during the decision-making stage. However, earlier attentional RL models suggested that attentional weights are applied to the prediction error signal itself. These earlier models also suggested that teaching signals are weighted based on the learned predictive value of a stimulus feature, rather than through decaying the weights of unchosen features (Mackintosh, 1975; Pearce and Hall, 1980). Niv et al. (2015) do not test this possibility, though they do allude to it in the discussion. Formally comparing these models with the fRL+decay model would give some insight into the stage at which attentional processes operate during multidimensional learning.

Niv et al. (2015) do not attempt to distinguish the contributions that the different regions identified by their analysis make to representation learning, and many questions remain about how RL mechanisms interface with attentional selection to guide behavior. Recent work suggests that the IPS signals the behavioral relevance of options during decision-making (Peck et al., 2009). Hunt et al. (2014) have suggested that this region might select attributes based on behavioral relevance, showing that communication between IPS, lateral OFC, and putamen depended on the respective relevance of stimuli and actions to a current choice. Like the results of Niv et al. (2015), these findings point to a role for IPS and lateral OFC in representation learning and selection among stimulus features.

There has recently been a great deal of interest in the role of OFC in model-based RL. It has been suggested that this region provides a cognitive map that allows representation of the underlying reinforcement contingencies of a task (Wilson et al., 2014; Stalnaker et al., 2015). The experiment in Niv et al. (2015) similarly requires adaptively learning which stimulus features are currently relevant, demanding that subjects quickly learn the hidden state of reinforcement in each game. OFC might operate together with lateral prefrontal cortex and IPS to adaptively attend to relevant stimulus features based on a model of the reinforcement contingencies in the task. Future work, possibly using techniques with finer temporal resolution and better signal quality in OFC, will be required to understand how this network of regions operates together to guide representation learning.

The results of Niv et al. (2015) provide an important step forward in understanding the neurobiological and computational mechanisms underlying representation learning. The authors demonstrate that some straightforward modifications of the basic RL model can greatly improve its predictive power in a fairly complex task. More broadly, these results point a way forward for using the relatively simple mechanics of RL to generate tractable computational hypotheses about learning in complex environments. This line of work might prove useful as researchers move toward grappling with the neurobiology of behavior in more ecologically valid settings.

Footnotes

  • Editor's Note: These short, critical reviews of recent papers in the Journal, written exclusively by graduate students or postdoctoral fellows, are intended to summarize the important findings of the paper and provide additional insight and commentary. For more information on the format and purpose of the Journal Club, please see http://www.jneurosci.org/misc/ifa_features.shtml.

  • This work was supported by a CIHR operating grant (MOP 97821) and a Desjardins Outstanding Student Award to A.R.V.

  • The author declares no competing financial interests.

  • Correspondence should be addressed to Avinash R. Vaidya, Montreal Neurological Institute, McGill University, 3801 University Street, Room 276, Montreal, QC, H3A 2B4, Canada. avinash.vaidya{at}mail.mcgill.ca

References

  1. Corbetta M, Shulman GL (2002) Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci 3:201–215. doi:10.1038/nrn755
  2. Hunt LT, Dolan RJ, Behrens TE (2014) Hierarchical competitions subserving multi-attribute choice. Nat Neurosci 17:1613–1622. doi:10.1038/nn.3836
  3. Le Pelley ME (2010) Attention and human associative learning. In: Attention and associative learning: from brain to behaviour (Mitchell CJ, Le Pelley ME, eds). Oxford: Oxford UP.
  4. Mackintosh NJ (1975) A theory of attention: variations in the associability of stimuli with reinforcement. Psychol Rev 82:276–298. doi:10.1037/h0076778
  5. Niv Y, Daniel R, Geana A, Gershman SJ, Leong YC, Radulescu A, Wilson RC (2015) Reinforcement learning in multidimensional environments relies on attention mechanisms. J Neurosci 35:8145–8157. doi:10.1523/JNEUROSCI.2978-14.2015
  6. Pavlov IP (1927) Conditioned reflexes. London: Oxford UP.
  7. Pearce JM, Hall G (1980) A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychol Rev 87:532–552. doi:10.1037/0033-295X.87.6.532
  8. Peck CJ, Jangraw DC, Suzuki M, Efem R, Gottlieb J (2009) Reward modulates attention independently of action value in posterior parietal cortex. J Neurosci 29:11182–11191. doi:10.1523/JNEUROSCI.1929-09.2009
  9. Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Classical conditioning II: current research and theory (Black AH, Prokasy WF, eds), pp 64–99. New York: Appleton-Century-Crofts.
  10. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593–1599. doi:10.1126/science.275.5306.1593
  11. Stalnaker TA, Cooch NK, Schoenbaum G (2015) What the orbitofrontal cortex does not do. Nat Neurosci 18:620–627. doi:10.1038/nn.3982
  12. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. Cambridge, MA: MIT.
  13. Wilson RC, Niv Y (2012) Inferring relevance in a changing world. Front Hum Neurosci 5:189. doi:10.3389/fnhum.2011.00189
  14. Wilson RC, Takahashi YK, Schoenbaum G, Niv Y (2014) Orbitofrontal cortex as a cognitive map of task space. Neuron 81:267–279. doi:10.1016/j.neuron.2013.11.005