Abstract
Prospective (forward) temporal–spatial models are essential for both action and perception, but the literature on perceptual prediction has primarily been limited to the spatial domain. In this study we asked how the neural systems of perceptual prediction change when change-over-time must be modeled. We used a naturalistic paradigm in which observers had to extrapolate the trajectory of an occluded moving object to make perceptual judgments based on the spatial (direction) or temporal–spatial (velocity) characteristics of object motion. Using functional magnetic resonance imaging we found that a region in posterior cerebellum (lobule VII crus I) was engaged specifically when a temporal–spatial model was required (velocity judgment task), suggesting that circuitry involved in motor forward-modeling may also be engaged in perceptual prediction when a model of change-over-time is required. This cerebellar region appears to supply a temporal signal to cortical networks involved in spatial orienting: a frontal-parietal network associated with attentional orienting was engaged in both (spatial and temporal–spatial) tasks, but functional connectivity between these regions and the posterior cerebellum was enhanced in the temporal–spatial prediction task. In addition to the oculomotor spatial orienting network, regions involved in hand movements (aIP and PMv) were recruited in the temporal–spatial task, suggesting that the nature of perceptual prediction may bias the recruitment of sensory-motor networks in orienting. Finally, in temporal–spatial prediction, functional connectivity was enhanced between the cerebellum and the putamen, a structure which has been proposed to supply the brain's metric of time.
Introduction
Time, the fourth dimension, is central to both action and perception. In action, we need to predict how the state of the external world will evolve to coordinate our movements – to catch a ball, the arm must be moved to intercept it at a future position. In perception, the role of temporal information is less obvious but no less central: We need to extrapolate how objects in the visual world will move over time to “keep tabs” on multiple objects as our attention shifts to other objects and back again (Michotte, 1954; Pylyshyn and Storm, 1988). In fact, continuous spatiotemporal trajectory is one of the strongest and earliest developing cues to visual object-hood (Xu, 1999; Flombaum et al., 2004).
In the examples above, temporal information is used prospectively, in conjunction with spatial information, to extrapolate how the external world will change over time. In other words, a forward model of the external world is formed in four dimensions. Whereas the concept of a four-dimensional forward model is familiar in the motor-control literature (Wolpert and Miall, 1996; Kawato et al., 2003), the literature on perceptual prediction has been primarily limited to the spatial domain.
Outside the motor-control literature, the concept of prediction is closely associated with that of attentional orienting. In orienting paradigms [e.g., the Posner task (Posner, 1980)], the observer has certain expectations about the spatial location or other feature of a perceptual event. However, most attentional orienting tasks either fix or randomize the intervals between stimuli to control or avoid response-preparation confounds. Temporal information is therefore explicitly controlled or excluded from the perceptual predictions.
In contrast, a few studies have manipulated temporal predictions directly. Using a temporal version of the Posner paradigm, Coull and Nobre (1998) showed that anterior-inferior parietal cortex and ventral premotor cortex, areas typically involved in preparation of manual responses, were engaged in temporal orienting. Assmus et al. (2003) observed activity in anterior-inferior parietal cortex when participants predicted the trajectory of two balls to judge whether they would collide, a task which requires the integration of spatial and temporal information.
In this study we explicitly compare two tasks in which a prospective model of perceptual events is required: one in which temporal information must be incorporated into the model, and one which is limited to the spatial domain. We developed a novel paradigm similar to the “real-world” scenarios described above, in which participants used their observations of a moving object to extrapolate its trajectory. Participants made perceptual judgments which either required integrated spatial and temporal information (judgments of velocity), or which required only spatial information (judgments of direction). We asked how the pattern of brain activity changed when the perceptual model had to represent change over time.
Materials and Methods
Participants
Participants were 12 young adults (age 21–33, mean 26.6 years), 4 men and 8 women. All were right handed. The methods and procedures used in the study had approval from the Northwestern University Institutional Review Board. Participants gave written informed consent before the study and were paid $40 for their time.
Stimuli and cognitive task
The stimulus display is illustrated in Figure 1. On each trial, the target started in the center of the screen and moved toward the periphery. After 800–1700 ms of observed motion (mean 1200 ms), the target became “invisible,” appearing to slide under an occluding surface (the edge of which was marked with a dotted line on screen) and continued, invisibly, along its trajectory for 600 ms; during this period participants had to use their knowledge of the direction and/or velocity of motion to extrapolate the movement of the target. After the period of invisible motion, the target reappeared and participants were asked to make perceptual judgments about how its motion had deviated from their expectations (see below). Participants were forced to construct a model of target motion with new parameters on each trial, because the direction and velocity of motion were different on each trial. Five speeds of motion from 2.5° to 5.3° visual angle per second were used (2.5°, 3.06°, 3.67°, 4.42°, and 5.3°; each speed was approximately 120% of the previous speed); the direction of motion was one of 12 trajectories equally spaced in a 360° range (±15°, ±45°, ±75°, ±105°, ±135°, and ±165° from the vertical).
During the occluded period, the target's motion was slightly perturbed. In the temporal–spatial condition, the velocity of the target was slightly increased or decreased, so it reappeared slightly too far or not far enough along its trajectory compared with its predicted position based on motion extrapolation. In the spatial condition, the direction of motion of the target was perturbed, so that it reappeared shifted slightly to the left or right of its predicted position. We can draw the distinction between temporal–spatial versus spatial because a model of target motion, including a temporal dimension, would be required to predict how far the target should have traveled, whereas purely spatial knowledge would be sufficient to judge a leftwards or rightwards shift in trajectory.
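The geometry of the two perturbations can be made concrete with a short sketch. The code is illustrative only: the function names, the coordinate convention (direction measured clockwise from the vertical, origin at the screen center), and the assumption that the perturbation takes effect at the moment of occlusion are choices made for this example and are not specified in the task description.

```python
import math

OCCLUDED_S = 0.6  # duration of invisible motion given in the text (600 ms)


def displacement(speed_deg_s, direction_deg, duration_s):
    """Return the (x, y) displacement, in degrees of visual angle, of
    constant-velocity motion. Direction is measured clockwise from the
    vertical (an assumed convention)."""
    theta = math.radians(direction_deg)
    distance = speed_deg_s * duration_s
    return (distance * math.sin(theta), distance * math.cos(theta))


def reappearance_point(speed_deg_s, direction_deg, visible_s,
                       velocity_change=0.0, direction_change_deg=0.0):
    """Return the point at which the target reappears after occlusion.

    Assumption: the visible segment uses the original parameters and
    the occluded segment uses the perturbed ones."""
    x0, y0 = displacement(speed_deg_s, direction_deg, visible_s)
    dx, dy = displacement(speed_deg_s * (1.0 + velocity_change),
                          direction_deg + direction_change_deg,
                          OCCLUDED_S)
    return (x0 + dx, y0 + dy)


def separation(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Example trial: 3.06 deg/s along the +45 deg trajectory, 1.2 s visible.
occlusion_point = displacement(3.06, 45, 1.2)
predicted = reappearance_point(3.06, 45, 1.2)
too_far = reappearance_point(3.06, 45, 1.2, velocity_change=0.30)
shifted = reappearance_point(3.06, 45, 1.2, direction_change_deg=4.5)
```

Under these assumptions, a velocity change of 30% moves the reappearance point along the original trajectory by 0.3 × speed × 0.6 s, whereas a direction change leaves the distance traveled during occlusion unchanged and displaces the point sideways.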
Participants responded with a nonspeeded choice button-press response, with the right index or middle finger. In the temporal–spatial (velocity judgment) condition, the two buttons indicated whether the target had traveled “too far” or “not far enough”; in the spatial (direction judgment) condition, the buttons indicated “clockwise” or “anti-clockwise” – the direction in which the trajectory had been shifted. Participants were reminded of the response contingencies by an on-screen prompt, which also indicated the time to respond.
Note that there was no extra information about the timing of the response available in the temporal–spatial condition compared with the spatial condition. In both conditions the duration of “invisible” motion was always predictable (600 ms), and the response prompt was always delayed 350 ms after the reappearance of the target – so participants knew when they would be responding in both conditions. Furthermore, participants were instructed to emphasize accuracy over reaction time. This is an important caveat because if the timing of the response had been more predictable or more salient in the temporal condition it could be argued that task differences in activation related to response preparation rather than perceptual extrapolation.
Participants maintained central fixation throughout the task.
Experimental procedure
All participants completed a behavioral training session and an fMRI session. The training session took place 24–48 h before the fMRI session.
Training session.
The objectives of the training session were (1) to allow participants to reach a stable level of performance, and (2) to titrate the task parameters to match task difficulty (defined in terms of accuracy rate) between the temporal–spatial and spatial conditions for each individual participant. Difficulty was manipulated through the magnitude of the change in velocity (defined as a percentage of the observed velocity) in the temporal–spatial task or the magnitude of the change in direction (defined as an angle of displacement) in the spatial task; smaller changes were harder to discriminate. To match difficulty between tasks, we first established each individual participant's accuracy on the temporal–spatial task with a fixed velocity change of ±30%. We then used a staircase algorithm to find the angle of displacement in the spatial task that gave the same accuracy level. This meant that task parameters were different for each individual, but that difficulty was matched between the two tasks for every individual. In the course of the training session, participants completed ∼150 trials of each task (depending on performance).
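The text does not specify which staircase rule was used. As one possible illustration, the sketch below implements a weighted up/down rule, in which the ratio of the upward to the downward step size sets the accuracy level at which the staircase settles. The function names, step size, and lower bound are hypothetical choices, not the parameters used in the study.

```python
def next_displacement(angle_deg, correct, target_accuracy,
                      step_down_deg=0.2, min_angle_deg=0.1):
    """One update of a weighted up/down staircase (illustrative only).

    After a correct response the displacement angle shrinks (harder);
    after an error it grows (easier). The upward step is scaled so that
    the expected change is zero when the proportion correct equals
    target_accuracy."""
    step_up_deg = step_down_deg * target_accuracy / (1.0 - target_accuracy)
    if correct:
        new_angle = angle_deg - step_down_deg
    else:
        new_angle = angle_deg + step_up_deg
    return max(min_angle_deg, new_angle)


def expected_change(target_accuracy, actual_accuracy, step_down_deg=0.2):
    """Average change in angle per trial at a given actual accuracy."""
    step_up_deg = step_down_deg * target_accuracy / (1.0 - target_accuracy)
    return (actual_accuracy * -step_down_deg
            + (1.0 - actual_accuracy) * step_up_deg)
```

In this scheme the target accuracy would be set to the participant's measured accuracy on the temporal–spatial task, so that the displacement angle drifts toward the value that yields the same accuracy on the spatial task.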
During the training session participants were trained to perform the task with central fixation. Eye movements were monitored online with a remote video-based infrared eye-tracker, and participants were reminded to maintain central fixation each time they made an eye movement. All participants were able to maintain central fixation during task performance, and eye movements were very rare (< 1% of trials).
fMRI session.
In the fMRI session, participants completed eight blocks of each task with five trials per block (total block length 15–20 s). Blocks were randomly ordered and interspersed with 25-s rest blocks. Each task block was preceded by a 5-s visual cue indicating the condition (temporal–spatial or spatial) for the following block. During the fMRI session task parameters were fixed. The change in velocity during the occluded period was ± 30% of the original velocity. The trajectory was altered by an angle of 2.5° to 6.8° (this value was adjusted for each individual to match difficulty between tasks; see above, Training session). The initial (preocclusion) velocities and directions of movement were the same as in the training block. Eye movements were monitored online with a remote video-based infrared eye-tracker; all subjects successfully maintained central fixation throughout.
fMRI methods
Image acquisition.
Data were acquired using a 3-Tesla Siemens (Munich, Germany) Trio whole-body MRI system using a birdcage head coil. Images were acquired using echo-planar T2*-weighted imaging. 52 × 3 mm trans-axial slices were acquired with an interleaved ascending sequence, covering the whole brain and including the cerebellum, with a voxel size of 3 × 3 × 3 mm [TR, 2.670 s; TE, 20 ms; flip angle, 80°; matrix, 64 × 64]. Each task consisted of 315 image sets. The first five image sets were collected in the absence of any task to allow the signal to reach a steady state, and were excluded from additional processing and analysis.
Individual T1-weighted structural images were acquired for each participant in 176 × 1 mm transaxial slices [TR, 2.000 s; TE, 4.38 ms; flip angle, 8°; matrix, 256 × 256; FOV, 100%].
Preprocessing.
Data were preprocessed and analyzed using statistical parametric mapping (SPM2, Wellcome Department of Cognitive Neurology, London, UK) implemented in MATLAB 6.5.1 (MathWorks, Natick, MA) running within a KDE Linux operating system. Images were corrected for differences in slice-timing and then realigned and unwarped to correct for movement artifacts. High-resolution anatomical T1 images were coregistered with the realigned functional images to enable anatomical localization of the activations. Structural and functional images were spatially normalized into a standardized anatomical framework using the default EPI template provided in SPM2, based on the averaged-brain of the Montreal Neurological Institute and approximating the normalized probabilistic spatial reference frame of Talairach and Tournoux (1988). Functional images were spatially smoothed using a 7 mm Gaussian kernel. The resulting spatial resolution was ∼10 mm³ full-width at half-maximum. The time series was temporally filtered to eliminate contamination from slow drift of signals (high-pass filter, 200 s) and corrected for autocorrelations using the AR(1) model in SPM2.
Cerebellar normalization.
We used a separate normalization process for data from the cerebellum. Using a standard whole-brain normalization process, registration between individuals and MNI space is suboptimal in the cerebellum (Diedrichsen, 2006). Because cerebella vary relatively little between individuals compared with the cortical landmarks used for whole-brain normalization, it is possible to achieve a much better registration by normalizing the cerebella separately. Good spatial registration is important because cerebellar structures are small relative to cortical structures.
We used the SUIT toolbox (Diedrichsen, 2006) for SPM2 to normalize each individual's structural scan to an infra-tentorial template, and then used the resulting deformation maps to normalize the cerebellar sections of each person's functional images. The SUIT toolbox has the additional advantage that coordinates can be adjusted from MNI space to the corresponding coordinates on the un-normalized Colin-27 brain [a very high-resolution structural MRI of one individual (Holmes et al., 1998)], which is described anatomically in a cerebellar atlas (Schmahmann et al., 2000). We used this feature to identify anatomical regions within the cerebellum.
After this stage, we performed parallel but identical statistical analyses on the functional data for the whole-brain and cerebellar normalized images.
fMRI analysis
Statistical analyses were implemented in the General Linear Model using SPM2. We used a box-car model to represent the task blocks, with the canonical HRF as the basis function. Four explanatory variables (EVs) were included per session – two for the main task (temporal–spatial and spatial) and two for the task-instruction periods.
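A block regressor of this kind can be sketched as a box-car convolved with a double-gamma hemodynamic response function. The code below is an approximation for illustration: the HRF parameters (gamma shapes of 6 and 16, undershoot ratio of 1/6) are commonly quoted defaults and are assumed here rather than taken from the text, the HRF is sampled directly at the TR rather than on a finer time grid, and the block onset and length are hypothetical.

```python
import math

TR_S = 2.67  # repetition time given in the acquisition parameters


def gamma_pdf(t, shape):
    """Gamma density with unit scale, evaluated at time t (seconds)."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def double_gamma_hrf(t):
    """Positive response minus a smaller, later undershoot (assumed shape)."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def boxcar(n_scans, block_onsets, block_length):
    """1 during task blocks and 0 elsewhere; onsets and length in scans."""
    series = [0.0] * n_scans
    for onset in block_onsets:
        for i in range(onset, min(onset + block_length, n_scans)):
            series[i] = 1.0
    return series


def convolve_with_hrf(series, hrf_duration_s=32.0):
    """Discrete causal convolution of a scan-wise series with the HRF."""
    n_kernel = int(hrf_duration_s / TR_S) + 1
    kernel = [double_gamma_hrf(k * TR_S) for k in range(n_kernel)]
    output = []
    for i in range(len(series)):
        total = 0.0
        for k, weight in enumerate(kernel):
            if i - k >= 0:
                total += series[i - k] * weight
        output.append(total)
    return output


# Hypothetical single block: starts at scan 10 and lasts 7 scans (about 19 s).
regressor = convolve_with_hrf(boxcar(40, block_onsets=[10], block_length=7))
```

The resulting regressor stays at zero until the block begins, rises with a delay of a few scans, and dips below zero after the block ends, which is the shape the GLM then scales to fit each voxel's time series.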
Individual T-contrasts were generated for each of the tasks against the resting baseline, and by comparing the tasks directly (temporal–spatial condition – spatial condition, and vice versa). These were entered into a second-level group analysis. First, to identify all areas which were activated by our tasks, we made separate whole-brain contrasts to compare activation in the temporal–spatial and spatial tasks with baseline, at a corrected threshold [equivalent false-discovery rate (FDR) 0.05; minimum cluster size, 10 voxels].
Second, we asked which areas were more strongly activated in one task or the other. To do this we entered the individual contrasts (temporal–spatial − spatial, and vice versa) into a group t-contrast. For these contrasts we used the low threshold of p < 0.05 uncorrected, to maximize sensitivity and therefore detect any areas preferentially activated by one of the tasks: because the two tasks were very closely matched, we expected only subtle differences in activation. However, we masked the contrasts to include only those voxels which were significantly activated at the corrected (FDR < 0.05) threshold in one or other of the tasks in the first place (compared with rest): thus all the differences reported were within areas significantly associated with the experimental paradigm at a corrected threshold. These contrasts were also masked to exclude deactivations relative to baseline in the reference condition (e.g., the contrast temporal–spatial − spatial was masked with spatial greater than zero).
Third, we asked whether there were any areas in which activity was specific to one task or the other. We defined task-specific activations as those activations passing a high threshold in one task (FDR < 0.05), with no activity in the other task, even at a low threshold (p < 0.05 uncorrected).
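The thresholding logic of these steps can be written out explicitly. In this sketch the per-voxel inputs (FDR-adjusted q values, uncorrected p values, and t values against rest) are assumed to have been computed already; the function and argument names are hypothetical.

```python
FDR_Q = 0.05  # corrected threshold for task-versus-rest contrasts
P_UNC = 0.05  # uncorrected threshold for direct task comparisons


def in_comparison_mask(q_task_a, q_task_b, t_reference_vs_rest):
    """A voxel enters a direct comparison only if it is active versus
    rest in either task at the corrected threshold, and is not
    deactivated in the reference (subtracted) condition."""
    active_somewhere = q_task_a < FDR_Q or q_task_b < FDR_Q
    return active_somewhere and t_reference_vs_rest > 0


def is_task_biased(p_difference, q_task_a, q_task_b, t_reference_vs_rest):
    """Task A > task B at the uncorrected threshold, within the mask."""
    return (p_difference < P_UNC
            and in_comparison_mask(q_task_a, q_task_b, t_reference_vs_rest))


def is_task_specific(q_target_task, p_other_task):
    """Active at the corrected threshold in one task, with no activity
    in the other task even at the uncorrected threshold."""
    return q_target_task < FDR_Q and p_other_task >= P_UNC
```

Written this way, the asymmetry between the two criteria is explicit: task-biased voxels may be active in both tasks, whereas task-specific voxels must fail even the lenient threshold in the other task.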
Functional connectivity analysis.
Finally, to probe how the task-specific activations we observed in the cerebellum related to activity in the cerebral cortex, we performed an analysis of psychophysiological interactions (PPI) (Friston et al., 1997). Specifically, we asked where in the brain there was a change in interaction with the posterior cerebellum, when temporal information was incorporated into the perceptual prediction. The logic of a PPI analysis is that, if two brain areas A and B interact in a task-specific manner, the regression of activity in area B on activity in area A (or vice versa) may change between task conditions. A PPI analysis implemented in the GLM framework in SPM searches for voxels displaying this property, in relation to some region of interest, throughout the whole brain.
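The idea of a regression slope that changes between conditions can be illustrated with toy numbers. The values below are invented solely for illustration and are not data from the study; the helper name is hypothetical, and a real PPI is estimated within the GLM rather than by fitting separate slopes per condition.

```python
def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance = sum((a - mean_x) ** 2 for a in x)
    return covariance / variance


# Invented activity in a seed region (area A), identical in both conditions.
seed = [0.0, 1.0, 2.0, 3.0]

# Invented activity in a target region (area B) under each condition.
target_temporal_spatial = [1.0, 3.0, 5.0, 7.0]  # tightly coupled to the seed
target_spatial = [1.0, 1.5, 2.0, 2.5]           # weakly coupled to the seed

# A positive difference means B depends more steeply on A in the
# temporal-spatial condition, which is what a PPI regressor tests for.
ppi_effect = slope(seed, target_temporal_spatial) - slope(seed, target_spatial)
```

Note that the seed activity is the same in both toy conditions, so the effect reflects a change in coupling rather than a change in the overall level of activity in either region.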
To generate psychophysiological interaction regressors, we created volumes of interest (VOIs) corresponding to the focus of cerebellar activation (in lobule VII crus I, in the left cerebellar hemisphere) in each individual subject's functional data. VOIs were constrained to lie in lobule VII crus I on the individual structural scan, and to have an x-coordinate between −8 and −20. We extracted the time course of activity from each VOI, corrected for effects of interest. These time courses were de-convolved to remove the effect of the HRF (Gitelman et al., 1999), multiplied with the psychological regressor of interest (a 1, −1 contrast between the temporal–spatial and spatial tasks) and re-convolved with the canonical HRF. Finally the PPI regressors, together with regressors representing the main effects of task and the time course of the VOI, were entered into a whole-brain analysis using the general linear model in SPM2. Group t-contrasts for the PPI regressor versus baseline were produced at a threshold of p < 0.05 uncorrected. This low threshold was used to minimize false negatives, because the power of PPI analysis in a nonfactorial experiment is low compared with the power of main-effects analysis (Friston et al., 1997). PPI results are presented for the whole brain, and were not masked with the main effect of task, because valid PPI effects can arise in the absence of a main effect.
Additional behavioral experiment
In the two tasks, temporal–spatial and spatial, the stimuli which participants observed were very similar. The difference between the tasks was determined by the judgments the participants made, which required them to focus either on the direction of motion (spatial task) or velocity (temporal–spatial task) during observation and to extrapolate these properties. To confirm that participants did indeed adjust their behavior according to the task instructions, we conducted an additional behavioral experiment in which we manipulated task set to find out whether there was a cost of being asked about the un-cued dimension (cf. Coull et al., 2004).
Participants.
The participants were 20 young adults, 11 men and 9 women, recruited from the Oxford University community and paid £10 for their time. The study was approved by the Ethics Committee of the University of Oxford.
Cognitive task.
Participants performed blocks of the temporal–spatial or spatial tasks, as before. However, the target's motion was now perturbed both in terms of velocity (as in the temporal–spatial task) and direction (as in the spatial task) and participants could be asked about either of these properties.
At the start of each block of 12 trials, the participant was informed of the probability that they would be asked about direction of motion (spatial task) or velocity (spatial–temporal task) by a symbolic cue. The probabilities were 100% spatial–temporal, 75% spatial–temporal and 25% spatial, 50–50, 25% spatial–temporal and 75% spatial, and 100% spatial. The dimension being probed was indicated by the response prompt (i.e., participants only learned which dimension they would actually have to respond to on that particular trial after the reappearance of the target). Participants responded with a choice button-press, and were instructed to emphasize accuracy over reaction time, as in the main experiment.
All participants completed two sessions: a calibration session and a test session. In the calibration session, participants completed separate blocks of 150 trials of each of the tasks. During this session, the parameters (angle of displacement or percentage velocity change) were adjusted using a staircase design to give 80% accuracy on each task; thus, the parameters differed slightly across individuals. In the test phase, participants completed 25 blocks of 12 trials. Trial types were mixed within each block in different proportions to give the five conditions described above. The order of conditions was randomized. Accuracy and reaction time were recorded for each trial.
Results
Behavioral results
Main experiment
Performance in the fMRI session was well matched on the two tasks in terms of accuracy (mean = 78% for the temporal–spatial prediction task, 81% for the spatial prediction task; paired-samples t test, t(11) = 0.82, p = 0.43) and reaction time (mean = 403 ms for the temporal–spatial task, 431 ms for the spatial prediction task; t(11) = 0.91, p = 0.38), indicating that the titration of difficulty was successful and transferred well to the scanner session. The displacement used in the spatial condition ranged from 2.3° to 6.8°, mean 4.5°.
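As a consistency check, the reported p values can be recovered from the t statistics and their degrees of freedom. The sketch below integrates the Student t density numerically with Simpson's rule; the function names and the number of integration steps are arbitrary choices made for illustration.

```python
import math


def t_density(t, df):
    """Probability density of Student's t distribution."""
    coefficient = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coefficient * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t_value, df, steps=1000):
    """Two-tailed p value for a t statistic (steps must be even).

    Integrates the density from 0 to |t| with Simpson's rule, doubles
    it to obtain the central area, and subtracts from 1."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    half_area = total * h / 3
    return 1.0 - 2.0 * half_area


p_accuracy = two_tailed_p(0.82, 11)       # accuracy comparison
p_reaction_time = two_tailed_p(0.91, 11)  # reaction-time comparison
```

Both values agree with the reported p = 0.43 and p = 0.38 to two decimal places, so the statistics and degrees of freedom are mutually consistent for a sample of 12 participants.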
The fact that performance was well matched between tasks is significant because otherwise it could have been argued that the temporal–spatial task was more difficult than the spatial task, and that task difficulty accounted for increases in activity in the temporal–spatial task. Because performance was equivalent across tasks, any increases in activity must relate to a difference in the nature, rather than difficulty, of the tasks.
Task-set experiment
To confirm that participants were focusing on different aspects of the target's motion in the two tasks, we performed an additional behavioral experiment in which we manipulated task set parametrically. We found that there was a cost in both accuracy and reaction time when participants were focusing on one dimension and then were asked to make judgments about the other (Fig. 2). Accuracy increased parametrically with the certainty that the dimension in question would be probed. In a repeated-measures ANOVA, there was a significant linear effect of task expectation on accuracy (for spatial trials, F(1,19) = 36.4, p < 0.0005; for spatial–temporal trials, F(1,19) = 6.09, p = 0.023). There was also a significant linear effect of task expectation on reaction time (for spatial trials, F(1,19) = 222.3, p < 0.0005; for spatial–temporal trials, F(1,19) = 122.8, p < 0.0005). Because participants were instructed to emphasize accuracy over speed of response, the reaction time effect probably reflects response uncertainty rather than speed of processing.
The accuracy and reaction-time costs were symmetrical – there was as much cost when participants were cued to the spatial–temporal task and then the spatial task was probed as vice versa. This is intriguing because it could be argued that the cognitive processes involved in the spatial task are simply a subset of those involved in the temporal–spatial task, but the present result suggests a more symmetrical relationship across tasks, with some enhancement of spatial prediction when participants focused on the direction of motion.
fMRI results
Task-specific activation of the cerebellum
There was only one area observed to be specific to one of the tasks: the posterior cerebellum, which was only active in the temporal–spatial prediction condition (Table 1). We defined task-specific activations as those activations passing a high (equivalent FDR <0.05) threshold in one task, with no activity in the other task, even at a low threshold (p < 0.05 uncorrected). The activity extended along lobule VII, crus I of the left cerebellar hemisphere, and into the same lobule in the right cerebellar hemisphere. In the left cerebellar hemisphere, there were two local maxima within the activation: one in the medial part of the cerebellar hemisphere and one in the lateral part. This pattern was reflected in the right hemisphere, although the right-lateral peak did not pass the corrected threshold.
A distinction between posterior and anterior cerebellum, proposed on the basis of patient studies, is that the posterior cerebellum is involved in nonmotor aspects of cognition, whereas the anterior cerebellum is more involved in motor functions (Schmahmann and Sherman, 1998). The posterior cerebellum, including crus I, has connections to association cortex including prefrontal cortex, whereas connections with motor cortex project to the anterior cerebellum (Kelly and Strick, 2003). Crus I (activated here) has connections to dorsolateral prefrontal area 46, in a closed loop with the thalamus and the dentate; other regions of prefrontal cortex may also have connections with the posterior cerebellum, which are at present uncharted. Because our task involved perceptual prediction rather than motor planning, the activation of the posterior cerebellum fits in with this anterior-posterior distinction.
Shared and task-biased activity in the cortex
Shared activation pattern.
In both the temporal–spatial and spatial tasks compared with baseline, there was activity in parietal regions (extending along the length of the intraparietal sulcus in both hemispheres), frontal eye fields and premotor cortex bilaterally, dorsolateral prefrontal cortex bilaterally and presupplementary motor area, and V1/V2 bilaterally. This pattern of activity could be summarized by saying that it resembled the “frontal-parietal network” which has been associated with visuospatial orienting (Mesulam, 1990; Gitelman et al., 1999). In addition to the “classic” frontal parietal network, we observed bilateral activation in area MT, an area involved in visual motion processing which can be modulated by attention to visual motion (Büchel et al., 1998).
The engagement of this frontal–parietal oculomotor network in orienting, together with other task-appropriate areas, has been observed in a variety of spatial and nonspatial orienting tasks, including orienting to semantic categories (Cristescu et al., 2006) and to scenes in long-term (Summerfield et al., 2006) or working memory (Lepsien et al., 2005). The activations in each task compared with the low-level baseline are presented side-by-side in the supporting online material (supplemental Table 1, available at www.jneurosci.org as supplemental material).
Task-related enhancement of activity in the cerebral cortex.
Although the dominant pattern of activity in the cortex was shared between tasks (as would be expected given that the two tasks were very similar), there were some regions in which activity was disproportionately increased in the temporal–spatial task, revealed in the simple contrasts temporal–spatial > spatial and vice versa (Table 1; Fig. 3).
Enhanced activity in the temporal–spatial task.
In the temporal–spatial task, enhanced activity occurred in anterior inferior parietal cortex and ventral premotor cortex, two areas associated with object-directed grasping. In anterior inferior parietal cortex, the activity was bilateral and fell in a region of parietal cortex which is thought to be homologous to macaque area aIP [the correspondence is indicated by their similar pattern of connections as revealed by diffusion-weighted imaging (Rushworth et al., 2005), and by receptor cytoarchitectonics (Caspers et al., 2006)]. Functional imaging studies suggest that the region plays a similar role in humans and in macaques, being involved in object-directed grasping with the hand (Binkofski et al., 1999; Shikata et al., 2001; Grefkes and Fink, 2005). Activity was also increased in the temporal–spatial task in the ventral premotor cortex and adjacent area 44. This region is also associated with object-directed grasping and has corticocortical connections with aIP (Rushworth et al., 2005).
There was also enhanced activity in the temporal–spatial prediction task in the dorsolateral prefrontal cortex [area 46 (Petrides and Pandya, 1999)] and in the anterior presupplementary motor area. Both dlPFC (Lewis and Miall, 2006) and pre-SMA (Coull et al., 2004) have been associated with processing of temporal information. However, these areas have also been implicated in many other “higher” cognitive functions, notably in the initiation of self-generated action (Jahanshahi et al., 1995; Lau et al., 2004); thus, the areas in which activity was enhanced in the temporal–spatial task have all been associated with action and/or motor planning, as well as with temporal processing, in the literature.
All of the enhancements of activity relating to temporal–spatial prediction were stronger in the right hemisphere. A similar pattern of activity was present in the left hemisphere but did not pass threshold. This is intriguing because previous work on temporal prediction/orienting has shown a left-hemisphere bias in activation (Coull and Nobre, 1998; Assmus et al., 2003). The right lateralization in the present task may reflect right-hemisphere dominance for spatial attention (because integration of spatial and temporal information is key here), or for perceptual as opposed to motor tasks.
Enhanced activity in the spatial task.
We performed the converse contrast, spatial > temporal–spatial. However, there were no voxels in which activity was significantly higher in the spatial task than in the temporal–spatial task at the threshold of p < 0.05 uncorrected. This is slightly surprising because in the behavioral task-set experiment, we found that there was a behavioral benefit of task set for the spatial as well as the spatial–temporal task. The lack of spatial-specific areas supports the hypothesis that temporal–spatial prediction uses much the same neural mechanisms as spatial prediction, but with recruitment of additional areas as described above.
Functional connectivity results
We used psychophysiological interactions (PPI) to ask which areas in the brain showed functional connectivity with the area of task-specific activation in the cerebellum (lobule VII crus I) (Fig. 4, Table 2). By investigating the interaction between the task contrast temporal–spatial > spatial and activity in the cerebellar region engaged in the temporal–spatial task, we asked which brain areas increase their interaction with that cerebellar region when temporal information must be incorporated into a perceptual prediction.
The cortical areas which showed functional connectivity with the cerebellum reflected the spatial orienting network engaged by both tasks: the intraparietal sulcus and the frontal eye fields, extending into dorsolateral prefrontal cortex, all bilaterally, showed functional connectivity with the posterior cerebellum. In the parietal lobe, there were regions of functional connectivity both in the anterior inferior parietal cortex (the region which showed enhanced activity in the temporal–spatial task), and along the length of the IPS, reflecting the pattern of activity which was equally strong in both the spatial and temporal–spatial tasks. As well as the spatial attention network, the cerebellar VOI showed functional connectivity with area MT bilaterally, a region involved in processing of visual motion, which was engaged by the main effect of both tasks.
Although the functional connectivity analysis was seeded at a region which was only engaged when temporal predictions were required, the regions in which functional connectivity was observed were not those which showed enhanced activity in the temporal–spatial prediction task alone, but rather those areas engaged by both tasks. This suggests that, rather than engaging separate networks, spatial and temporal–spatial perceptual prediction both engage the frontal-parietal spatial orienting network, but when temporal information must be integrated into the model, interaction with the cerebellum becomes important.
The posterior cerebellum also showed functional connectivity with the anterior putamen bilaterally. The putamen has been implicated in timing through pharmacological evidence (for review, see Matell and Meck, 2004). Recent anatomical evidence shows that the cerebellum has connections with the basal ganglia including the striatum (Hoshi et al., 2005), raising the intriguing possibility that these two sets of structures, both implicated in the timing literature, interact in this type of temporal prediction task.
The putamen has been proposed as the source of the brain's timing signal, possibly as part of a medial–frontal–striatal loop (Macar et al., 1999; Coull et al., 2004). Alternatively, Ivry and Spencer (2004) have proposed that the role of the basal ganglia in timing is to set the threshold for releasing an action or making a decision; in timing tasks, that threshold may be a point in time. The anatomical evidence (Hoshi et al., 2005) is for a projection from the cerebellum to the putamen (disynaptically via the thalamus); the return connection would be less direct, probably via the globus pallidus, cortex and pons. The PPI between cerebellum and putamen is therefore more readily interpreted in terms of Ivry's “gating” model than the frontal–striatal clock model. However, our methods cannot reveal the direction of information flow and therefore cannot distinguish between these hypotheses.
No main effect of task was observed in the putamen, unlike the spatial orienting network described above. This suggests that whatever role the putamen plays in temporal–spatial prediction, it is equally active in all task conditions (and rest), but its interaction with the cerebellum increases when a temporal–spatial forward model is required.
Lateralization
Note that the PPI effects were predominantly bilateral. The connections between cerebellum and cerebrum are crossed, so, with a seed VOI in the left cerebellar hemisphere, we might have expected to observe stronger PPI effects in the right cerebral hemisphere. However, PPI effects do not necessarily reflect a direct anatomical connection, but rather a functional network; information could pass between hemispheres either in the cerebral cortex, or in the cerebellum, because the pattern of activity in the cerebellum was essentially bilateral.
Discussion
We compared two tasks in which observers needed to form a three-dimensional or four-dimensional prospective perceptual model to extrapolate different aspects of object motion. By asking the participants to make judgments about either direction (which could be done using only spatial information) or velocity (which required incorporation of temporal information into the model) we were able to ask how brain systems involved in perceptual prediction change when change over time must be represented.
The cerebellum in perceptual forward modeling
When the task included a temporal dimension, a region of the posterior cerebellum (lobule VII, crus I) became active. The cerebellar activation was specific to the task-blocks in which participants made velocity judgments and therefore had to incorporate temporal information into their perceptual predictions. When they made judgments of direction, which required only spatial information, the cerebellum was not active. This posterior region of the cerebellum has been associated with cognitive (nonmotor) processing (Schmahmann and Sherman, 1998), and has connections with prefrontal rather than motor cortex (Kelly and Strick, 2003).
The importance of the cerebellum in motor timing is well documented. Cerebellar dysfunctions such as dysdiadochokinesia have traditionally been explained in terms of the temporal (mis)coordination of muscle-groups (Holmes, 1939). Cerebellar damage also affects the timing of actions in relation to external events: When cerebellar patients throw a ball, their inaccuracies in aim arise from mis-timing the release of the ball in relation to the arm movement (Timmann et al., 1999). Computationally, the cerebellum is thought to be involved in forward-modeling of motor behavior, and these forward models have a strong temporal aspect (Miall et al., 1993; Wolpert and Miall, 1996); the architecture of the cerebellum lends itself to timing functions (Braitenberg, 1967).
The role of the cerebellum as a forward-modeling device extends to canceling the sensory consequences of actions (Miall et al., 1993; Ramnani, 2006). For example, when people tickle themselves, suppression of the ticklish sensation is associated with increased activation in the cerebellum (Blakemore et al., 1999). Again the temporal dimension is integral to the model: introducing a delay between the action and sensory feedback reinstates ticklishness (Blakemore et al., 2001). Cerebellar cancellation of the expected results of movement may be important in sensory discrimination – for example in tactile exploration with the fingers (Gao et al., 1996).
The cerebellum has been implicated in timing through a series of patient studies by Ivry and colleagues (Keele et al., 1985; Spencer et al., 2003; Zelaznik et al., 2005), who have suggested that the cerebellum is particularly involved in event timing (Ivry et al., 2002) – prospective control of the timing of discrete responses. This type of timing would require temporal forward models similar to those required to extrapolate the temporal–spatial trajectory of perceptual stimuli in the present experiment.
The current results demonstrate that the role of the cerebellum in forward-modeling extends into the perceptual domain. The posterior cerebellum is involved in prospective modeling of purely perceptual stimuli, but only when a model of change over time is required. The fact that the temporal–spatial condition engaged the cerebellum whereas the purely spatial condition did not is intriguing, because it suggests that the type of forward-modeling circuitry used in motor control may be engaged in perceptual processing specifically when temporal information is incorporated into the model.
Cerebellar–cortical interactions in temporal prediction
The cerebellum becomes involved in perceptual prediction when a model extended in time is required. How does this cerebellar activity relate to the cerebral cortex? A functional connectivity (PPI) analysis indicated that in the temporal–spatial prediction task the cerebellum interacts with a cortical network of areas involved in spatial orienting (Gitelman et al., 1999). This frontal-parietal network, including IPS and FEF and in this case MT, was engaged equally by the spatial and temporal–spatial prediction tasks, but its interaction with the cerebellum was enhanced when predictions had a temporal dimension.
This finding suggests that the circuitry involved in perceptual prediction in spatial tasks can integrate a temporal signal from the cerebellum. Recently, unit-recording studies have revealed that firing rate can be tuned to temporal expectancies, or hazard functions, in areas more traditionally associated with spatial information processing, including LIP (Leon and Shadlen, 2003; Janssen and Shadlen, 2005), V1 (Shuler and Bear, 2006) and V4 (Ghose and Maunsell, 2002). Here we see that intraparietal regions, including LIP, have enhanced interaction with the cerebellum when a temporal aspect is added to perceptual prediction, suggesting a link between the cerebellar and cortical literatures – perhaps a cerebellar signal temporally tunes the activity of neurons in these cortical regions.
Note that the suggestion that the cerebellum provides prospective signals about the timing of events does not necessarily imply that the cerebellum is the seat of an “internal clock.” The present results suggest that the cerebellum is involved in using temporal information, but the metric of time could be generated here or elsewhere. The question of how the brain keeps time is unresolved; one popular hypothesis is that there is a general-purpose timing system, relying on corticostriatal circuits (Coull et al., 2004; Lustig et al., 2005). Other investigators favor a distributed representation of time in terms of neural-network states (Karmarkar and Buonomano, 2007).
Effector systems in perceptual prediction
As well as enhanced interaction between cortex and cerebellum, we observed a shift in cortical activation patterns. In the temporal–spatial prediction task, there was an enhancement of activity in two regions not generally associated with spatial attention: the anterior inferior parietal region (aIP) and ventral premotor cortex (PMv).
Both PMv and aIP have been implicated in tasks in which participants form expectations about the timing of future events. PMv is engaged when people observe or reproduce temporal sequences (Schubotz and von Cramon, 2001; Schubotz et al., 2003). aIP has been implicated in a perceptual task when participants integrated spatial and temporal information to predict whether two moving objects would collide (Assmus et al., 2003). Both regions are active during attentional orienting to points in time (Coull and Nobre, 1998).
In the present experiment, the engagement of PMv and aIP in the temporal–spatial task seems to represent a shift between two frontal-parietal networks, relating to different effector systems. Both PMv and aIP are involved in producing hand movements; the oculomotor network was also engaged, but equally in the two tasks. The premotor theory of attention (Rizzolatti et al., 1987) argues that the oculomotor network is engaged in spatial attention because predictions about the positions at which stimuli will appear are manifested as prepared saccades. Perhaps the differential involvement of reaching systems in temporal and nontemporal perceptual predictions arises because temporal prediction is simply more important for limb movements. Saccades are made toward a precise point in space, but are generally not directed toward a point in time – when people look at moving objects, they make a saccade with a short, stereotyped time course toward one point on its projected path, followed by smooth pursuit eye movements [although this process can be temporally tuned (Medina et al., 2005)]. In contrast, grasping movements have much more flexible time courses, the duration of movement is non-negligible, and the movements must be timed to intersect with objects in the outside world, which are often moving themselves. Therefore it could be argued that timing is naturally more important for hand movements than for eye movements.
Conclusions
The cerebellum is involved in forward-modeling of purely perceptual events, if and only if a model of change over time is required. Although this is analogous to the role of the cerebellum in motor planning, functional connectivity analysis indicates that in the perceptual case the cerebellar timing signal projects to regions involved in perceptual orienting. Interaction between the cerebellum and the putamen was also enhanced in temporal–spatial prediction.
A working hypothesis might be that forward-modeling circuitry in the cerebellum is used to predict events in the temporal dimension. The temporal prediction generated in the cerebellum could be used to temporally tune the activity of cortical neurons representing spatial predictions (that is, in IPS and FEF). Thus the neural substrate of the four-dimensional forward model would be corticocerebellar interaction. Note that as PPI analysis cannot tell us the direction of information flow, interference studies will be required to clarify the interactions between areas. For example, is temporal tuning of cortical neurons disrupted in animals with cerebellar lesions?
To date, multiple literatures related to timing have developed in relative isolation, focused on different brain regions. Patient work has implicated the cerebellum in event timing; pharmacological work has implicated the striatum; and electrophysiological studies have shown temporal tuning of neuronal activity in diverse cortical areas. The present findings begin to forge a link between these diverse findings by showing interactions between cerebellar, cortical and basal ganglia regions in a naturalistic temporal–spatial prediction task.
Footnotes
- This work was partly funded by a Wellcome Trust 4-year Ph.D. studentship awarded to J.X.O.R. and by National Institutes of Health Grant NS30863 from the National Institute of Neurological Disorders and Stroke. Thanks to Kate McCarthy and Brittany Lapin for help with data collection, Joeran Lepsien and Joern Diedrichsen for help with fMRI data analysis, and Dick Passingham for useful comments.
- Correspondence should be addressed to Jill X. O'Reilly, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford OX1 3UD, UK. joreilly{at}fmrib.ox.ac.uk