2006 Special Issue
Perceiving the unusual: Temporal properties of hierarchical motor representations for action perception
Introduction
An increased interest in computational mechanisms that allow robots to observe, imitate and learn from human actions has resulted in a number of computational architectures that match demonstrated actions to the observer robot's equivalent motor representations (Alissandrakis et al., 2002; Billard, 2000; Demiris & Hayes, 2002; Schaal et al., 2003). These architectures, whilst sharing common computational components, such as modules for processing and classifying visual information and retrieving motor representations, differ in the way the perceptual information is coded and classified, the organisation of the motor system, and the stage at which the motor representations are used. The final aspect, the stage at which the motor representations are used, differentiates architectures that follow the general ‘observe, classify, imitate’ decomposition (Kuniyoshi, Inaba, & Inoue, 1994) from those that advocate a stronger involvement of the motor systems in the perception process, through a ‘rehearse, predict, observe, reinforce’ decomposition (Demiris & Hayes, 2002; Demiris & Johnson, 2003; Schaal et al., 2003). In the latter, the observer robot invokes its motor systems to rehearse potential actions, predicting and confirming incoming observed states during the demonstration. This approach has gained biological credibility with the discovery of the mirror system in monkeys and humans (Grèzes et al., 2003; Rizzolatti et al., 1996). Not all theoretical models advocate the actual rehearsal of candidate actions as our previous work has done (Demiris & Hayes, 2002); some opt instead for a weaker version of this motor theory of perception, usually termed ‘motor resonance’, in which the motor representations are retrieved through a resonance mechanism rather than a generative one.
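As an illustration of the ‘rehearse, predict, observe, reinforce’ decomposition, the following sketch scores candidate behaviours by how well their rehearsed predictions track a demonstration. All names, dynamics and the confidence-update rule are hypothetical illustrations, not the implementation of any of the cited architectures:

```python
import math

# Each candidate behaviour pairs an inverse model (state -> command) with a
# forward model (state, command -> predicted next state). Confidences are
# reinforced when the rehearsed prediction matches the observed demonstration.

def rehearse_and_reinforce(candidates, observed_states, gain=1.0):
    """Return a confidence score per candidate behaviour."""
    confidences = {name: 0.0 for name in candidates}
    for prev, curr in zip(observed_states, observed_states[1:]):
        for name, (inverse, forward) in candidates.items():
            command = inverse(prev)              # rehearse: what would I do here?
            predicted = forward(prev, command)   # predict the resulting state
            error = abs(predicted - curr)        # compare with the observation
            confidences[name] += gain * math.exp(-error)  # reinforce good matches
    return confidences

# Toy 1-D demonstration: a state that increases by 1 at each step.
demo = [0.0, 1.0, 2.0, 3.0]
candidates = {
    "advance": (lambda s: 1.0, lambda s, u: s + u),   # tracks the demo exactly
    "retreat": (lambda s: -1.0, lambda s, u: s + u),  # predicts the opposite
}
scores = rehearse_and_reinforce(candidates, demo)
# The 'advance' behaviour accumulates higher confidence than 'retreat'.
```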
For imitation approaches that advocate the use of motor systems during the perception stage, it becomes crucial to have a clear and flexible organisation of the motor system. Hierarchical representations, with primitive motor structures at the lowest level and progressively more complex structures at higher levels, have been proposed (Demiris & Johnson, 2003; Wolpert et al., 2003) and tested in robotic systems (Demiris & Johnson, 2003), which successfully learned and used sequences of actions by observation. However, little has been done with respect to the temporal dimension of these representations, including how they can be coordinated, as well as their relation to biological data.
In this paper, we examine in detail the issue of hierarchical representations, and in particular how higher-level models can be composed from (and coordinate) lower-level primitives. Our approach uses representations based on the biologically plausible minimum variance model of movement control (Harris & Wolpert, 1998; Simmons & Demiris, 2005), which leads to a principled and biologically plausible coordination of the underlying components. We subsequently compare a particular instantiation of our hierarchical attentive multiple models for execution and recognition (HAMMER) architecture (Demiris & Khadhouri, in press) for reaching and grasping actions with transcranial magnetic stimulation (TMS) data from humans during the passive observation of grasping movements by a demonstrator (Gangitano et al., 2004).
Hierarchies
Hierarchies are computationally interesting since they advocate a logical representational decomposition: motor primitives at the lower levels take care of the executional details while progressively higher levels shift their emphasis towards exerting temporal, contextual and cognitive control. From a robotics point of view, this allows for easier task planning and execution. In action understanding and gesture recognition, hierarchical representations have been regularly used since they allow
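This division of labour can be sketched in a few lines of code. The classes below are a hypothetical illustration of the decomposition, not the paper's implementation: the primitive holds the executional detail, while the higher-level node only exerts temporal (ordering) control:

```python
# Illustrative sketch: a higher-level node delegates execution to lower-level
# primitives and handles only their ordering, mirroring the decomposition in
# which primitives carry executional detail and higher levels exert control.

class Primitive:
    def __init__(self, name, command):
        self.name, self.command = name, command

    def step(self, state):
        return state + self.command   # executional detail lives here


class Sequence:
    """Higher-level node: coordinates primitives, holds no motor detail."""
    def __init__(self, children):
        self.children = children

    def run(self, state):
        trace = [state]
        for child in self.children:
            state = child.step(state)
            trace.append(state)
        return trace


# A hypothetical composite behaviour built from two primitives.
reach_and_grasp = Sequence([Primitive("reach", 2.0), Primitive("grasp", 0.5)])
trace = reach_and_grasp.run(0.0)
```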
Building blocks
The HAMMER family of architectures uses inverse and forward models (Karniel, 2002, Narendra and Balakrishnan, 1997, Wolpert and Kawato, 1998) as the basic building blocks. An inverse model is a module that takes as inputs the current state of the system and the target goal(s) and outputs the control commands that are needed to achieve or maintain those goal(s). The functional reverse to this concept is that of a forward model of a controlled system: a forward model is a module that takes as
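The complementary roles of the two building blocks can be made concrete with a toy one-dimensional example. The point-mass dynamics and gain below are illustrative assumptions, not the HAMMER implementation:

```python
# Toy 1-D system with assumed dynamics: next_state = state + command.

def inverse_model(state, goal, gain=0.5):
    """Current state + goal -> motor command that moves towards the goal."""
    return gain * (goal - state)

def forward_model(state, command):
    """Current state + command -> predicted next state (the functional
    reverse of the inverse model)."""
    return state + command

state, goal = 0.0, 10.0
for _ in range(20):
    u = inverse_model(state, goal)    # which command achieves the goal?
    state = forward_model(state, u)   # which state will that command produce?
# After repeated application the predicted state converges on the goal.
```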
The HAMMER–MV implementation
HAMMER–MV follows the general architecture of HAMMER, but uses minimum variance controllers as lower level inverse models, and coordinated combinations of these at the higher ones. We will start by giving an overview of our implementation of the minimum variance model, and show results on how it can be used to generate biologically plausible reaching trajectories. Subsequently, we will describe how we implement a particular instance of a hierarchical representation for a grasp using the minimum
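The core idea behind the minimum variance model can be illustrated with a toy computation. Assuming, as in Harris and Wolpert (1998), that motor-command noise has a standard deviation proportional to command magnitude, a simple integrator shows why smoother command profiles yield lower endpoint variance; the dynamics and noise constant below are illustrative assumptions, not the controller used in HAMMER–MV:

```python
# Signal-dependent noise: each command u contributes independent noise with
# standard deviation k * |u|, so endpoint variance is the sum of (k * u)^2.

def endpoint_variance(commands, k=0.1):
    """Endpoint variance of a simple integrator driven by noisy commands."""
    return sum((k * u) ** 2 for u in commands)

# Two command profiles producing the same total displacement (sum = 4.0):
jerky  = [4.0, 0.0, 0.0, 0.0]   # all effort in one large command
smooth = [1.0, 1.0, 1.0, 1.0]   # effort spread evenly over the movement

# Because the noise penalises large commands quadratically, the smooth
# profile achieves the same displacement with lower endpoint variance.
```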
Experiments
In our final set of experiments, human demonstrations of reaching actions were recorded and given as input to a simulated 2D arm with six degrees of freedom, controlled using HAMMER–MV. In the following sections, we describe the visual stimuli we recorded, and the equations governing the matching of the model arm's performance against the human data.
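The specific matching equations are not reproduced in this snippet. As an illustrative stand-in, a root-mean-square error between the demonstrated and generated trajectories shows the kind of comparison involved; the function and data below are hypothetical, not the paper's metric:

```python
import math

def rms_error(demonstrated, generated):
    """Root-mean-square error between two equal-length 1-D trajectories."""
    assert len(demonstrated) == len(generated)
    return math.sqrt(
        sum((d - g) ** 2 for d, g in zip(demonstrated, generated))
        / len(demonstrated)
    )

# Hypothetical sampled positions: recorded human reach vs. model arm output.
human = [0.0, 0.8, 1.9, 3.1, 4.0]
model = [0.0, 1.0, 2.0, 3.0, 4.0]
score = rms_error(human, model)   # lower score = closer match to the human
```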
Discussion
It has been argued earlier in this paper that hierarchical representations can be a useful engineering tool when structuring motor systems. The HAMMER–MV implementation of this concept demonstrates why: hiding the lower-level details inside higher-level structures allows for easier task planning than that achieved with flat, non-hierarchical representations, since only the details of the goal and desired task parameters need to be supplied and the higher inverse model will recruit and
Conclusions
The neurophysiological data mentioned in this paper lend support to the notion that the human brain does not passively observe actions but actively forms hypotheses and predicts forthcoming states. In Gangitano et al. (2004), it was shown that there is no temporal fragmentation of the action plan in the motor representation of the observer. The computational implementation of the HAMMER architecture described in this paper reproduced these results, using a hierarchical controller based on the
Acknowledgements
The first author acknowledges the support of the UK Engineering and Physical Sciences Research Council (EPSRC Grant GR/S11305/01) and the Royal Society. The second author is supported by an EPSRC/DTA doctoral scholarship. Thanks to all the BioART members for their valuable feedback, and especially to Anthony Dearden for his help in capturing and analysing the visual stimuli.
References
- Fagg, A. H., & Arbib, M. A. (1998). Modelling parietal–premotor interactions in primate control of grasping. Neural Networks.
- Fuster, J. M. (2004). Upper processing stages of the perception–action cycle. Trends in Cognitive Sciences.
- Grèzes, J., et al. (2003). Activations related to mirror and canonical neurones in the human brain: An fMRI study. NeuroImage.
- Karniel, A. (2002). Three creatures named forward model. Neural Networks.
- Rizzolatti, G., et al. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research.
- Rowe, J., et al. (2002). Attention to action: Specific modulation of corticocortical interactions in humans. NeuroImage.
- Sommerville, J. A., et al. (2005). Action experience alters 3-month-old infants' perception of others' actions. Cognition.
- Umiltà, M. A., et al. (2001). I know what you are doing: A neurophysiological study. Neuron.
- Wolpert, D. M., & Kawato, M. (1998). Multiple paired forward and inverse models for motor control. Neural Networks.
- Alissandrakis, A., et al. (2002). Imitating with ALICE: Learning to imitate corresponding actions across dissimilar embodiments. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans.
- Billard, A. (2000). Learning motor skills by imitation: A biologically inspired robotic model. Cybernetics and Systems.
- Demiris, Y., & Hayes, G. (2002). Imitation as a dual route process featuring predictive and learning components: A biologically-plausible computational model.
- Demiris, Y., & Johnson, M. (2003). Distributed, predictive perception of actions: A biologically inspired architecture for imitation and learning. Connection Science.
- Van Essen, D. C., & Maunsell, J. H. R. (1983). Hierarchical organization and functional streams in the visual cortex. Trends in Neurosciences.
- Ferrari, P. F., et al. (2005). Mirror neurons responding to observation of actions made with tools in monkey ventral premotor cortex. Journal of Cognitive Neuroscience.
- Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology.
- Flash, T., & Hogan, N. (1985). The coordination of arm movements: An experimentally confirmed mathematical model. The Journal of Neuroscience.
- Gangitano, M., et al. (2001). Phase-specific modulation of cortical motor output during movement observation. Cognitive Neuroscience and Neuropsychology.
- Gangitano, M., et al. (2004). Modulation of premotor mirror neuron activity during observation of unpredictable grasping movements. European Journal of Neuroscience.
- Gergely, G. (2003). What should a robot learn from an infant? Mechanisms of action interpretation and observational learning in infancy. Connection Science.
- Harris, C. M., & Wolpert, D. M. (1998). Signal-dependent noise determines motor planning. Nature.