Research Report
A dynamic model for action understanding and goal-directed imitation
Introduction
Humans and other primates are very good at recognizing and understanding goal-directed actions of conspecifics. This cognitive capacity is crucial for any social interaction because it enables the observer to adjust responses accordingly. The advantages of being able to predict the consequences of an ongoing action of another individual are obvious in the cooperative and competitive situations that define the social life of groups. What are the brain mechanisms underlying the capacity to recognize and understand actions displayed by others? Recent behavioral and neurophysiological evidence suggests that the neuronal structures involved in action production are to a large extent also activated during action observation. Action understanding might thus be based on a direct matching to the motor commands that an individual may use to reproduce observed actions and their consequences. Rizzolatti and colleagues have put forward this “direct matching hypothesis” based on their finding of mirror neurons in premotor and parietal areas of monkeys (di Pellegrino et al., 1992, Rizzolatti et al., 1996; for a recent review, see Rizzolatti et al., 2001). Mirror neurons respond both when the animal produces a given action and when it observes the experimenter or another monkey performing a comparable action. Importantly, the actions able to trigger mirror neurons must involve goal-directed behavior such as, for instance, the grasping or placing of an object. Mirror neurons thus seem to code not for the movement per se but for the purpose of the movement. During the past decade, neurophysiological evidence has accumulated that supports the existence of a mirror system matching action observation and action execution in humans as well. Moreover, the findings of several brain imaging studies have been taken as evidence that the circuit active during action observation roughly corresponds to the homologous circuit of mirror neurons in monkeys (Iacoboni et al., 1999, Rizzolatti et al., 2001).
As pointed out by Rizzolatti and colleagues, the suggested functionality of the mirror system provides a natural link between action understanding and imitation. In imitation, the motor description of, for instance, an observed grasping-placing behavior may be turned into an overt action when the response is allowed. Understanding the significance of the action (“placing an object at a new position”) is important, since otherwise the reproduced action would represent for the imitator nothing more than a series of meaningless gestures. A lack of understanding of course limits the capacity to apply or adapt the reproduced behavior in a new context.
There are, however, several findings indicating that an explanation of action understanding based purely on a simple and direct resonance phenomenon of the motor system is likely to be incomplete. Humans and also monkeys are able to infer action goals without a full visual description of the action (due to occluding surfaces, for instance) by combining partial visual information with additional contextual information (e.g., Assad and Maunsell, 1995, Filion et al., 1996, Umiltà et al., 2001). Similarly, it has been shown that infants at the age of 18 months are already able to act on a goal that they had to infer because the demonstrator “accidentally” failed to achieve the end-state of the action (Meltzoff, 1995). Obviously, in both the hidden condition and the error condition, a direct mapping from perception to action is not sufficient to explain the goal-directed behavior of the imitator. More generally, experiments with adults as models and children as imitators challenge the direct mapping hypothesis. Very often, a mere copy of the surface behavior displayed by the adult may not be appropriate or may even be impossible due to very different limb and body sizes. Children may nevertheless show their understanding of the task by reproducing the end-state using their own means. In a series of imitation tasks involving hand actions of different complexity, Bekkering and colleagues systematically investigated how the goal of an action (such as touching a dot on a table) affects the mapping from perception to action (Bekkering et al., 2000, Wohlschläger et al., 2003). The fundamental finding was that children primarily focus on reproducing the goal of the action and not on reproducing the means used. However, when the children were explicitly asked to pay attention to how the demonstrator achieved the goal (e.g., left or right hand), they were able to adopt the model's strategy.
Altogether, these findings suggest that, besides the mirror system, the neural circuit for action understanding and imitation involves representations that combine visual cues and contextual information to organize the means needed to achieve an intentional goal. The prefrontal cortex (PFC) has long been thought to be centrally involved in this process (Pochon et al., 2001, Quintana and Fuster, 1999; for a review, see Miller, 2000). The activity clusters reported in a recent positron emission tomography (PET) study using a goal-directed imitation paradigm fit well with this view (Chaminade et al., 2002). In particular, it was shown that the observation and later reproduction of only the means of a known action sequence (i.e., only the grasping but not the placing of a particular object was shown) led to a strong activation pattern in PFC (see also Buccino et al., 2004). The authors interpret this finding as evidence for neural processing that represents an “automatic” retrieval of the goal underlying the observed action.
Here we present a dynamic model that aims at substantiating the idea of a distributed neuronal network in which action understanding and goal-directed imitation occur within a continuous dynamic process. In its architecture, the model reflects the basic functionality of neuronal populations in distinct but anatomically connected areas of the frontal, temporal and parietal cortex, which are known to be involved in action observation and action execution. Contextual information, action means and action goals are explicitly represented as dynamic activity patterns of local pools of neurons.
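The notion of a "dynamic activity pattern of a local pool of neurons" can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the model's actual equations, which this excerpt does not give: it assumes a one-dimensional Amari-type field with local excitation and constant global inhibition, and the function name `simulate_field` as well as every parameter value (kernel strength and width, resting level `h`, time constant `tau`, sigmoid steepness) are hypothetical choices made for illustration.

```python
import math

def simulate_field(n=50, center=20, amp=4.0, steps=300, dt=1.0, tau=10.0, h=-2.0):
    """Euler-integrate an assumed 1-D Amari-type field; return final activations."""
    # Assumed interaction kernel: Gaussian local excitation minus constant
    # global inhibition, tabulated by distance between field sites.
    kernel = [3.0 * math.exp(-d * d / (2 * 3.0 ** 2)) - 1.0 for d in range(n)]

    # Assumed output nonlinearity: a steep sigmoid with threshold at u = 0.
    def f(u):
        return 1.0 / (1.0 + math.exp(-4.0 * u))

    # External input: a Gaussian bump centred on `center` (amp=0 means no input).
    s = [amp * math.exp(-(i - center) ** 2 / (2 * 2.0 ** 2)) for i in range(n)]

    u = [h] * n  # the field starts at its resting level
    for _ in range(steps):
        out = [f(x) for x in u]
        # tau * du/dt = -u + h + input + lateral interaction
        u = [
            u[i] + dt / tau * (
                -u[i] + h + s[i]
                + sum(kernel[abs(i - j)] * out[j] for j in range(n))
            )
            for i in range(n)
        ]
    return u

if __name__ == "__main__":
    driven = simulate_field()
    peak_site = max(range(len(driven)), key=lambda i: driven[i])
    print("site with maximal activation:", peak_site)
```

With these assumed parameters, an input centred on one site drives a localized supra-threshold pattern around that site, while the field without input remains below threshold at its resting level. In a sketch of this kind, the position of the localized pattern stands for the encoded value (for example, a particular grip or goal).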
Specifically, we apply an imitation paradigm consisting of a grasping-placing sequence to show how the mapping from perception to action may contribute to the inference of the action goal. We also simulate how the knowledge about the action goal can be used to flexibly change between different means to reproduce the witnessed action effect. A second objective of the present modeling study is to illustrate how learning within the network can be exploited for skill growth. Here we focus on changes in environmental constraints and on observed means not in the motor repertoire of the imitator.
To illustrate the functionality of the dynamic model directly, we apply a simulator for a robot arm with many degrees of freedom. The model implements a cognitive “decision module” that decides which means the artifact uses to reproduce the observed or inferred action effect. Since we focus on the goal of the action and do not assume that demonstrator and imitator share the same embodiment, the implementation may be seen as a contribution to solving the correspondence problem that is now considered a major challenge for robot imitation (Alissandrakis et al., 2002; for a detailed discussion, see also Dautenhahn and Nehaniv, 2002).
We proceed as follows: in Section 4, we present the experimental paradigm and the overall model architecture. We also introduce the dynamic model and explain the underlying processing principles. Model predictions for variations of the basic experimental paradigm are described in Section 2. A critical discussion of several conceptual implications of our dynamic model is presented in Section 3.
Section snippets
Choice of means and goal inference in an imitation task
In the first simulation example shown in Fig. 1, we illustrate the behavior of the dynamic model for an experiment in which the imitator comes up with its own way of reproducing the observed action effect. The demonstrator has placed the object at the higher target combining a grip from the side (SG) and a trajectory above the bridge (AT). In the dynamic model, this information is encoded by localized activation patterns in the goal layer of PFC and in layer STS, respectively. The demonstrated
Discussion
When observing others in action with the intention to imitate the actions, we most likely do not encode the full detail of their motions but our interpretation of those motions in terms of the demonstrators' goal. The experimental literature reviewed in this article suggests the existence of a distributed representational system which allows one to “construct” the meaning of actions combining sensory evidence about environmental changes, situational context, prior task knowledge, and a matching
Experimental paradigm
For our modeling study, we adopt a paradigm that has been developed to investigate further, in experiments with humans, the idea that actions are organized in a goal-directed manner (van Schie and Bekkering, in preparation). The paradigm contains an object that must be grasped and then placed at one of two laterally presented targets that differ in height. The possible hand trajectories are constrained by the fact that an obstacle in the form of a bridge has to be avoided (Fig. 9). The task
Acknowledgments
This work was supported by the European grants ArteSImit (IST-200-29689) and JAST (IST-2-003747-IP). We would like to thank Drs. Harold Bekkering, Hein van Schie, Leonardo Fogassi and Giacomo Rizzolatti for numerous discussions about this work.
References (61)
- et al. Does the end justify the means? A PET exploration of the mechanisms involved in human imitation. NeuroImage (2002)
- et al. The distribution of neuronal population activation as a tool to study interaction and integration in cortical representations. J. Neurosci. Methods (1999)
- et al. Modeling parietal–premotor interactions in primate control of grasping. Neural Netw. (1998)
- et al. Mirror neurons and the simulation theory of mind-reading. Trends Cogn. Sci. (1998)
- et al. Banburismus and the brain: decoding the relationship between sensory stimuli, decisions, and reward. Neuron (2002)
- et al. The cortical control of movement revisited. Neuron (2002)
- et al. Neural representation for the perception of the intentionality of actions. Brain Cogn. (2000)
- et al. Demystifying social cognition: a Hebbian perspective. Trends Cogn. Sci. (2004)
- et al. The cortical motor system. Neuron (2001)
- et al. Premotor cortex and recognition of motor actions. Cogn. Brain Res. (1996)