Articles, Behavioral/Cognitive

Emergent Exploration via Novelty Management

Goren Gordon, Ehud Fonio and Ehud Ahissar
Journal of Neuroscience 17 September 2014, 34 (38) 12646-12661; DOI: https://doi.org/10.1523/JNEUROSCI.1872-14.2014
Goren Gordon
1Departments of Neurobiology and
Ehud Fonio
2Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel
Ehud Ahissar
1Departments of Neurobiology and
Figures

  • Figure 1.

    A schematic diagram of the model (adapted from Gordon et al., 2014) extending to arena exploration and across both sensory systems. a, Basic curiosity loop. The agent actively perceives the world through its sensors and learns to predict the next state of its sensors from the current state and the action performed by the actor. Novelty, measured as information gain, is the intrinsic reward for an AC module that implements temporal difference error (dashed red arrow) reinforcement learning. b, A model of an exploring rodent that moves its whiskers to perceive walls and moves its body to perceive an arena. The whiskers modality is composed of two loops, whereas the locomotion modality is composed of four loops. c, A hierarchical model of an active perceptual modality that contains n AC modules and one retreat primitive. At any time, only a single loop is closed (dark arrows); if at any time novelty is higher than the average of the active module, J(n), the retreat primitive is activated (red arrows); if novelty is lower than the average for the duration of the active loop, T(n), the next loop is activated (blue arrows).
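The switching rule in panel c can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `NoveltyManager`, the per-loop lists `mean_novelty` (standing in for J(n)) and `duration` (standing in for T(n), counted in time steps), and the choice to keep the current loop active after a retreat are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NoveltyManager:
    """Toy novelty management unit (hypothetical names and values)."""
    mean_novelty: list[float]  # assumed J(n): average novelty of each loop
    duration: list[int]        # assumed T(n): steps of low novelty before switching
    active: int = 0            # index of the single closed loop
    low_steps: int = 0         # consecutive steps with novelty below average

    def step(self, novelty: float) -> str:
        """Return which primitive acts on this time step."""
        average = self.mean_novelty[self.active]
        if novelty > average:
            # Novelty above the active loop's average: activate retreat.
            self.low_steps = 0
            return "retreat"
        if novelty < average:
            self.low_steps += 1
        else:
            # Assumption: novelty equal to the average breaks the low streak.
            self.low_steps = 0
        is_last = self.active == len(self.mean_novelty) - 1
        if self.low_steps >= self.duration[self.active] and not is_last:
            # Novelty stayed low for the loop's duration: close the next loop.
            self.active += 1
            self.low_steps = 0
        return f"loop {self.active + 1}"


manager = NoveltyManager(mean_novelty=[1.0, 2.0], duration=[3, 2])
trace = [manager.step(x) for x in [0.5, 0.5, 0.5, 2.5, 1.0]]
print(trace)
```

In this toy run, three low-novelty steps move the manager from loop 1 to loop 2, and a single high-novelty step then triggers retreat without changing the active loop.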

  • Figure 2.

    Model implementation on the whisker system across different timescales. a, Adapted from Gordon et al. (2014, their Fig. 5). Developmental convergence dynamics of actors, presented as protraction probability as a function of normalized time (αt, where α is the learning rate), averaged over 10 runs for σ = 0.5, pobj = 1.0. The protraction probability of the first actor (blue line) does not depend on contact information; the protraction probabilities of the second actor depend on whisking (solid red line), contact (dashed red line), detach (dotted red line), or pressure (dotted dashed red line) inputs. a, Inset, Logarithm (base 10) of normalized convergence time of the second AC module, as a function of σ and pobj. x marks parameters for a–c. b, Adapted from Gordon et al. (2014, their Fig. 7). Exploratory episode behavior of the entire converged model; whisker angle is depicted as a function of time, where color denotes the active actor. Magenta horizontal lines denote the angular position of an object. B1, actor 1 protracts the whisker and the retreat primitive retracts the whisker whenever a new angle is reached. B2, initially there are no objects in the whisker field and it protracts, whereupon experiencing no novelty, the NMU switches to the retreat policy (retraction). When objects are present, the initial contact is novel and immediately followed by retreat (B3), whereas the following contacts slowly exhibit the full dynamics of the converged actor 2 (B4). B5, when an object is removed from the whisker field, retreat follows high novelty due to false prediction of its location. c, Perceptual cycle of object location (b, enlarged box): protraction upon contact (magenta diamond), retraction upon pressure (cyan circle), and either retraction (t = 426) or protraction (data not shown) upon detach (yellow square) mechanoreceptor activation.

  • Figure 3.

    Model implementation on locomotive exploration of a novel circular arena (Fonio et al., 2009). a, Convergence dynamics of actors of the four loops (pl = 1,2,3,4), where pcorner denotes the probability to stay in corners, pwall denotes the probability to follow walls, and popen denotes the probability to avoid walls and seek open space (σ = 0.125, P0 = 0.1, averaged over 9 runs). b, Perceiver dynamics when exploring with the converged primitives and novelty management. Mean perceiver error as a function of time for the converged and random actors (same parameters as in a). Insets, Perceiver state at different times, where black denotes the probability of walls, green denotes the probability of no wall, and thickness denotes the distance from probability = 0.5. c, Exploration behavior of a novel circular arena for the converged exploration primitives and novelty management, where color denotes the active primitive (black, retreat; blue, loop 1; red, loop 2; magenta, loop 3; cyan, loop 4) and time progresses from top to bottom. Left, Zoom in on the first steps, where the light blue line denotes orientation of the mouse. Middle, Initial exploration in which only loops 1 and 2 are active. Right, Exploration of open space with loops 3 and 4 until reaching the center of the arena. d, Phase plane of model parameters, where regions were automatically discovered via clustering of the actor probabilities. Distances from cluster centroids are plotted as a function of the two free parameters of the model: σ and P0, where red/green/blue channels denote distance from centroids of clusters 1, 2, and 3. Capital letters (A–K) denote the entire set of mice (n = 11) described in Fonio et al. (2009), positioned in the phase plane such that their behavior best correlated with the behavior generated by the model, given the corresponding parameters. e, Transition trajectories of the experimental mice and matching model agents. Durations of exploration in each behavioral phase, normalized by their mean time, are depicted by their occurrence sequence. Dashed curves represent the behavior of individual mice (Fonio et al., 2009), and solid curves represent the behavior of model agents whose parameters are marked in d.

  • Figure 4.

    Developmental parameters produced by the model (σwhisker = 0.5; pobj = 1; σlocomotion = 0.125; P0 = 0.05). Whisker data have 1.6 × 10^5 time steps, automatically segmented to 941 entries, then grouped to 21 equal-sized bins, corresponding to developmental days. Locomotion data have 1.6 × 10^6 time steps, automatically segmented to 3375 entries, then grouped to 21 equal-sized bins. a, Appearance of whisker motion patterns (retraction/protraction/whisking), which were calculated only for loop 1 (no objects): no movement, normalized whisker angle <0.25; retraction, retraction from base state; whisking, continued protraction followed by full retraction, with amplitude >0.5 the normalized angle; protraction, otherwise. Protraction was never a result. b, Amplitude of whisker movement, calculated as the maximal normalized angle per entry. Comparison between a single linear fit (solid) and piecewise linear fits (dashed). c, Appearance of locomotion patterns (lateral/forward). Model patterns: No movement, entry duration <0.6αt; Forward, forward motion consists >55% of actions; Lateral, otherwise.
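The locomotion labels in panel c amount to a two-threshold decision rule, which the sketch below illustrates. It rests on assumptions not stated in the caption: entry duration is taken to be already expressed in units of normalized time αt (so the threshold is simply 0.6), actions are represented as strings, and the label "forward" and the function name `classify_locomotion_entry` are hypothetical.

```python
def classify_locomotion_entry(normalized_duration, actions):
    """Label one segmented entry as 'no movement', 'forward', or 'lateral'.

    normalized_duration: entry duration in units of normalized time (assumed).
    actions: sequence of action labels taken during the entry (assumed strings).
    """
    # Entries shorter than the duration threshold count as no movement.
    if normalized_duration < 0.6:
        return "no movement"
    # Fraction of actions that are forward steps; empty entries count as zero.
    forward_fraction = (
        sum(1 for a in actions if a == "forward") / len(actions) if actions else 0.0
    )
    # More than 55% forward actions gives 'forward'; anything else is 'lateral'.
    return "forward" if forward_fraction > 0.55 else "lateral"


print(classify_locomotion_entry(1.0, ["forward"] * 6 + ["left"] * 4))
```

An entry with exactly 55% forward actions falls under "lateral" here, since the caption gives the threshold as strictly greater than 55%.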

  • Figure 5.

    Novelty management principles in the whisker (Deutsch et al., 2012) and locomotion (Fonio et al., 2009) systems. a, Top, Example of whisking trajectory (i.e., angle as a function of time); red circles denote contact with a pole. Bottom, Novelty flow calculated from the trajectory; red crosses denote maximal novelty flow. Vertical lines denote whisk (excursion) beginning. b, Top, Example of locomotion trajectory, described in normalized polar coordinates of angle (blue) and radius (red) in a circular novel arena. Bottom, Novelty flow calculated from the trajectory; red crosses denote maximal novelty flow. Vertical lines denote entry (excursion) beginning. c, d, Difference between the novelty SNR of experimental and control animals in the whisking (c) and locomotion (d) systems; averaged over sessions (whisking system), animals (locomotion system), and 20 repetitions per session/animal for the controls (see text). Error bars denote SEM, ***p < 0.001. e, f, Dynamics of inbound movements, time aligned to the last point of maximal novelty flow, where for each data point in each excursion we calculated its spatial distance from the starting point of the excursion (error bars denote SEM). e, Change in angle in the inbound portion. f, Change in Cartesian distance from the home cage in the inbound portion. g, Percentage of excursions according to first (left), second (middle), and third (right) visited novelty zones: the exit from the home cage is the High novelty zone (red); the circumference of the arena is the Medium novelty zone (purple); the open space is the Low novelty zone (blue).

  • Figure 6.

    Behavior upon first touch in life with a vertical metal pole during an exploration excursion out of the home cage. a, Top, An example of 3 successive palpations. Numerals and colors denote which whisker column, either on the left (L) or right (R) side, touched the pole, where gray (midline) denotes contact with the nose. The white curve represents the distance of the center of the head from the object as a function of time, overlaid with the contact events displayed above. Bottom, Example images from the recorded films at different times. Snout (white contour) is automatically tracked, whereas the stationary object (green contour) is manually marked. Left, The mouse is distant from the pole and does not touch it. Middle, The mouse is lightly touching the pole (blue circle). Right, The mouse is touching the pole with its nose. b, Comparison of the sum of contact durations between experiment and model during the first and second palpations. Experimental results present the average over 11 mice (error bars denote SEM, **p < 0.02), model results averaged over 10 runs with accumulated novelty w_th uniformly drawn from the range of [8,9] bits and σwhisker = 0.001. Model times were normalized by first palpation duration. c, The first sequence of contact durations of a mouse as a function of contact number. The dashed red vertical line denotes the end of the first palpation episode. d, Same as c averaged over all mice (n = 11). Black line denotes exponential fit, a·e^(n/b) (a = 6.54 ms, b = 7.37 contacts), where dashed red vertical line represents b.
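For readers who want to reproduce a fit of the form a·e^(n/b) from panel d, the sketch below shows one common approach: ordinary least squares on the logarithm of the durations. The caption does not state how the authors performed their fit, so this method is an assumption, the function name `fit_exponential` is hypothetical, and the data are synthetic values generated from the reported parameters rather than the experimental measurements.

```python
import math


def fit_exponential(contact_numbers, durations):
    """Fit durations ~ a * exp(n / b) by linear regression on log(durations).

    Returns (a, b). Requires positive durations and at least two distinct n.
    """
    logs = [math.log(d) for d in durations]
    count = len(contact_numbers)
    mean_n = sum(contact_numbers) / count
    mean_log = sum(logs) / count
    # Least-squares slope and intercept of log(duration) against n.
    sxx = sum((n - mean_n) ** 2 for n in contact_numbers)
    sxy = sum((n - mean_n) * (y - mean_log) for n, y in zip(contact_numbers, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_n
    # log(d) = log(a) + n / b, so a = exp(intercept) and b = 1 / slope.
    return math.exp(intercept), 1.0 / slope


# Synthetic, noise-free durations (ms) built from the reported a and b.
ns = list(range(1, 11))
synthetic = [6.54 * math.exp(n / 7.37) for n in ns]
a, b = fit_exponential(ns, synthetic)
print(round(a, 2), round(b, 2))
```

Fitting in log space weights relative rather than absolute errors, so results on noisy data can differ from a direct nonlinear least-squares fit.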

Keywords

  • active sensing
  • hierarchical model
  • intrinsic motivation
  • reinforcement learning
  • whisker system

Copyright © 2023 by the Society for Neuroscience.
JNeurosci Online ISSN: 1529-2401
