Positron emission tomography (PET) was used to monitor regional cerebral blood flow variations while subjects were constructing mental images of objects made of three-dimensional cube assemblies from auditorily presented instructions. This spatial mental imagery task was contrasted with both passive listening (LIST) of phonetically matched nonspatial word lists and a silent rest (REST) condition. All three tasks were performed in total darkness. Mental construction (CONS) specifically activated a bilateral occipitoparietal–frontal network, including the superior occipital cortex, the inferior parietal cortex, and the premotor cortex. The right inferior temporal cortex also was activated specifically during this condition, and no activation of the primary visual areas was observed. Bilateral superior and middle temporal cortex activations were common to CONS and LIST tasks when both were compared with the REST condition. These results provide evidence that the so-called dorsal route known to process visuospatial features can be recruited by auditory verbal stimuli. They also confirm previous reports indicating that some mental imagery tasks may not involve any significant participation of early visual areas.
- spatial mental imagery
- dorsal visual pathway
- frontal cortex
- parietal cortex
- occipital cortex
- positron emission tomography
- cerebral blood flow
The significance of mental imagery in human cognition results from the capacity of this process to reactivate previous visual experience in a quasi-perceptual format. Visual images reflecting objects or spatial configurations are accessible to conscious inspection and can be externalized in particular in the form of verbal reports (Paivio, 1986). A recently emphasized feature of visual imagery is its capacity to build mental representations of objects that have never been experienced perceptually but have been described verbally. In such cases, the generation of visual images does not result from the reactivation of previously stored memories but does result from on-line construction of internal representations on the basis of the processing of verbal instructions and their encoding in a visuospatial format. Although such images may lack detail or vividness, they have been shown to reflect properties similar to those of images based on perceptual experience. In particular, cognitive operations performed on images, such as mental scanning or distance comparisons, exhibit chronometric patterns similar to those executed on images that reactivate stored visual information (Denis et al., 1995; De Vega et al., 1996).
Previous functional neuroanatomy studies using either positron emission tomography (PET) or functional magnetic resonance imaging (fMRI) have considered the brain regions involved in mental imagery (Kosslyn et al., 1993; Le Bihan et al., 1993; Mellet et al., 1995; Roland and Gulyas, 1995; Kosslyn et al., 1996). They have shown that images generated from previously memorized percepts activate regions engaged in visual perception, giving anatomical support to the analogies between perception and its mental equivalent. Moreover, in a recent report, we showed that mental exploration of a previously learned visual configuration involves regions belonging to the dorsal route (Mellet et al., 1995). This result concurs with the specific role of the parieto-occipital cortex in the spatial treatment of mental images (Levine et al., 1985; Farah et al., 1988) and may indicate that mental imagery is subject to the same dichotomy that was evidenced in the visual system between dorsal and ventral anatomo-functional pathways respectively specialized in the processing of spatial and figurative attributes of visual stimuli (Mishkin et al., 1983; Haxby et al., 1991, 1994).
It remains to be seen whether a highly specialized visual treatment route can be mobilized by an acoustico-verbal input. This should provide additional insight into the brain regions that are engaged when transmodal processing is required in real time. To evaluate whether the cerebral structures involved in the processing of the spatial aspects of visual perception (Mishkin et al., 1983; Haxby et al., 1991) could take part in a purely mental activity in which spatial information would be available in a lexical shape only, we set up a PET activation protocol in which the subjects had to construct “on-line” a mental object from spatial instructions supplied verbally.
MATERIALS AND METHODS
Subjects. Nine right-handed healthy French male students (ages, 22.7 ± 1.3 years; mean ± SD) participated in this study. All were free from nervous disease or injury and had no abnormality on their T1-weighted MRIs. Informed written consent was obtained from each subject after the procedures had been explained fully. Approval of these experiments was given by the Kremlin-Bicêtre Ethics Committee.
To ensure optimal homogeneity of the sample of the subjects with respect to their imagery abilities, subjects were selected from a group of 106 male subjects on the basis of their scores on the Minnesota Paper Form Board (MPFB) (Likert and Quasha, 1941) and the Mental Rotations Test (MRT) (Vandenberg and Kuse, 1978). The MPFB mean score of the nine selected volunteers was 22.7 ± 1.9 (mean ± SD; whole population, 19.8 ± 4.1), and their MRT mean score was 15.6 ± 1.9 (whole population, 12.0 ± 4.2).
Experimental protocol. Using ¹⁵O-labeled water, we made six sequential PET measurements of the Normalized regional Cerebral Blood Flow (NrCBF) of each subject, replicating a series of three experimental conditions: a spatial mental construction (CONS) task and two control conditions, namely a passive listening (LIST) task and a silent rest (REST).
CONS task. During the CONS task, the subjects were requested to build four three-dimensional mental objects made out of twelve cubes assembled on one or two of their sides (Shepard and Metzler, 1971). Each three-dimensional cube assembly was described by a list of 11 directional words. The lists were generated randomly using the six directional French words haut (up), bas (down), droite (right), gauche (left), avant (front), and arrière (back). Lists corresponding to planar objects or to objects containing a loop were discarded. Before the PET experiment, subjects were first presented with referential axes (Fig. 1, top) corresponding to the six directional words, and the CONS task was explained. The task itself consisted first of mental visualization of one cube, which served as the starting point of the construction, and then 11 other cubes were added according to a list of 11 directional words given verbally through earphones at 0.5 Hz (Fig. 1). At the end of the mental construction of the object, the subjects had to visualize the entire object for 5 sec and then delete it from their minds before again visualizing the starting cube and building the next object from another list of directional words.
This procedure was repeated four times with four different mental objects for each NrCBF measurement. No explicit memorization instruction was given. Different objects were used during the replication condition so that the eight objects built during the two CONS conditions were different.
This spatial imagery task was designed to elicit a visuospatial processing while at the same time limiting the lexico-semantic treatment, because verbal instructions were limited to six words indicating directions. The unfamiliar and unusual character of the mentally assembled objects as well as the fact that they were constructed on-line ensured that the mental images thus generated were not originating from visual memory.
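The list-generation rule described above (random lists of 11 directional words, discarding planar objects and objects containing a loop) can be illustrated with a short Python sketch. It is not the procedure actually used in the study: the axis orientation, the function names, and the reading of "loop" as a path that returns to an already occupied position are all assumptions made for illustration.

```python
import random

# French directional words mapped to unit steps on a cubic lattice.
# The axis orientation is an assumption made for illustration.
STEPS = {
    "haut": (0, 0, 1),      # up
    "bas": (0, 0, -1),      # down
    "droite": (1, 0, 0),    # right
    "gauche": (-1, 0, 0),   # left
    "avant": (0, 1, 0),     # front
    "arrière": (0, -1, 0),  # back
}


def cube_positions(words):
    """Return the position of the starting cube plus one cube per word."""
    pos = (0, 0, 0)
    cubes = [pos]
    for word in words:
        dx, dy, dz = STEPS[word]
        pos = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        cubes.append(pos)
    return cubes


def has_loop(cubes):
    """Assumed criterion: the path revisits an already occupied position."""
    return len(set(cubes)) < len(cubes)


def is_planar(cubes):
    """True when one coordinate is constant, i.e. all cubes lie in one plane."""
    return any(len({c[axis] for c in cubes}) == 1 for axis in range(3))


def random_instruction_list(rng, n_words=11):
    """Draw random word lists until one is neither planar nor looping."""
    vocabulary = list(STEPS)
    while True:
        words = [rng.choice(vocabulary) for _ in range(n_words)]
        cubes = cube_positions(words)
        if not has_loop(cubes) and not is_planar(cubes):
            return words


if __name__ == "__main__":
    print(random_instruction_list(random.Random(0)))
```

Because every step follows a lattice axis, an object is planar exactly when the list uses at most two of the three axes, which is why checking for a constant coordinate is sufficient here.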
Assessing the CONS task execution. In this study, no output was expected from the subjects while they were engaged in the CONS task. Because of this, one cannot assess in real time whether the task is executed correctly but must rely on postacquisition debriefing and testing. Accordingly, within 2 min after the completion of a CONS task PET acquisition, subjects were presented with drawings of four series of four different objects (Fig. 2) and were asked to identify the series they had just built. Each of the four series was made of the four objects of the original series the subject had just built (e.g., ABCD); the objects in each of the four presented series were placed in a different order, namely ABCD, BACD, ACBD, and CADB. In all but one of the series (CADB), there was at least one object in the same position as in the original series, so that the subjects had a 0.25 probability of choosing this series by mere chance instead of another one.
In addition, at the end of the entire PET experiment, subjects were debriefed on the strategy they had used in performing the postacquisition tests (verbal or by a comparison with the mental image). Additional items of the debriefing session consisted of subjective evaluation of task difficulty (on a five-point rating scale), reports on any intentional explicit memorization activity of the entire object, characteristics of the mental image (vividness, position in the mental visual field), characteristics of the cubes (color, size, opacity), reports on any unintentional operations on the mental object (displacing, rotating, zooming, refreshing), and the nature of the subjects’ mental activity during both control conditions.
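The property of the four presented series used in the recognition test (only CADB has no object in its original position) can be checked with a few lines of Python. This is only an illustrative check on the letter codes given in the text; the function name is hypothetical.

```python
def objects_in_place(original, presented):
    """Count objects that occupy the same position in both series."""
    return sum(o == p for o, p in zip(original, presented))


original = "ABCD"
series = ["ABCD", "BACD", "ACBD", "CADB"]
matches = {s: objects_in_place(original, s) for s in series}
print(matches)  # {'ABCD': 4, 'BACD': 2, 'ACBD': 2, 'CADB': 0}
```

With four candidate series and a subject choosing at random, each series, including CADB, would be picked with probability 1/4, which is the 0.25 chance level quoted for the test.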
Control task 1: passive listening to word lists. In the first control condition (LIST), subjects were instructed to listen attentively to word lists presented through earphones at the same frequency as that used in the CONS condition (0.5 Hz). The word lists for this control condition were obtained by replacing each directional word of the lists used during the CONS condition with a phonetically close French word having no directional content and low imagery value. The words were: taux (rate), cas (case), moite (wet), roche (stone), amant (lover), and amer (sour). Therefore, subjects listened to the exact same number of phonetically equivalent words presented at the same rate during both the CONS and the LIST conditions.
Control task 2: REST. In the second control condition (REST), no instructions were given to the subjects except that they were not to move. This baseline control condition is the usual reference condition used in our laboratory.
Image data acquisition. For each NrCBF measurement, thirty-one 3.375-mm-thick contiguous brain slices were acquired simultaneously on an ECAT 953B/31 PET camera with a 5 mm in-plane resolution (Mazoyer et al., 1991). A black chamber was set up all around the PET tomograph so that PET data were acquired in total darkness, with subjects’ eyes closed, in all conditions. To assess eye movements, horizontal electro-oculograms were recorded for each subject, using external electrodes placed at the external canthi and a right ear reference electrode. Emission data were acquired with septa extended. Tasks were started 30 sec before the intravenous bolus injection of 60 mCi of ¹⁵O-labeled water. A single 80 sec scan was acquired and reconstructed (including a correction for head attenuation using a measured transmission scan) with a Hanning filter of 0.5 mm⁻¹ cutoff frequency and a pixel size of 2 × 2 mm². The between-scan time interval was 15 min; a Latin square design was used to define the condition order.
Data analysis. Statistical parametric maps (SPMs) corresponding to comparisons between the CONS, LIST, and REST tasks were generated with the three-dimensional version of the SPM software (Friston et al., 1995). The original brain images were transformed into the standard stereotactic Talairach space (Talairach and Tournoux, 1988). Global differences in the NrCBF within and between subjects were removed by scaling, and comparisons across conditions were made by way of t statistics. As indicated above, the experimental protocol was designed so that both the REST and the LIST conditions were used as control conditions for the CONS task. Thus, significant increases compared with either the REST (CONS vs REST) or LIST (CONS vs LIST) control conditions were used to uncover the activation specific to the CONS task. In addition, the contrast between the LIST and REST conditions was studied to check areas specifically involved in lexico-semantic processing of words. For each comparison, the voxel amplitude t map was transformed into a Z volume that was thresholded at Z0 = 3.1, which corresponds to a significance level of 0.001 (without correction for multiple comparisons). We have also reported the significant decreases uncovered by the REST versus CONS comparison and the REST versus LIST comparison.
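The correspondence between the threshold Z0 = 3.1 and the 0.001 level can be verified numerically. The sketch below assumes a one-tailed test on a standard normal distribution, which the text does not state explicitly, and uses only the Python standard library.

```python
from statistics import NormalDist

Z0 = 3.1  # threshold applied to the Z volumes
std_normal = NormalDist()  # mean 0, standard deviation 1

# Upper-tail probability of exceeding Z0 (one-tailed, uncorrected; an assumption).
p_one_tailed = 1.0 - std_normal.cdf(Z0)

# Z value whose upper-tail probability is exactly 0.001.
z_for_p001 = std_normal.inv_cdf(1.0 - 0.001)

print(f"P(Z > {Z0}) = {p_one_tailed:.5f}")
print(f"Z for p = 0.001: {z_for_p001:.3f}")
```

Under this assumption the tail probability at Z0 = 3.1 is slightly below 0.001, and the exact 0.001 cutoff is about 3.09, so 3.1 is a marginally conservative rounding.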
We believed that it would be worthwhile to compare the results of the present study with data on mental spatial exploration reported recently from our laboratory (Mellet et al., 1995). In the previous study, the mental imagery task consisted of the mental exploration of the visual image of a previously presented spatial configuration. Subjects were asked to execute this task in total darkness without any time constraint, in contrast to the classic mental scanning paradigm that also calls for mental exploration (Kosslyn et al., 1978; Denis and Cocude, 1992). The control condition was the condition described above as REST. This study was carried out with another sample of eight subjects with high visuospatial abilities and with the same PET and data acquisition scheme; however, PET data were analyzed using a region-of-interest analysis method. To compare the two mental spatial imagery tasks properly, these previous data were thus reanalyzed using the same SPM approach as that used in the present report.
RESULTS
CONS task execution
It is noteworthy that the subjective details of the imagery activity during the construction of mental objects differed only slightly from one subject to another, considering the fact that no instructions were given regarding the qualitative aspects of the image to be built. According to postexperimental reports, the subjects created vivid images that enabled them to visualize quite clearly the entire cube assemblies. One subject generated only images detailed just enough to individualize each cube that composed the object. None of them produced a colored image.
All subjects but one moved the mental object during its construction in such a way that its center remained in the center of the mental visual field. All subjects but one performed frequent scanning of the object to maintain its overall shape. None of the subjects performed mental rotation of the objects. Two of them sometimes zoomed the object, but the size of the objects was the same with all subjects, and they systematically filled the entire visual field. Notwithstanding the absence of memorization instructions, four subjects declared that they tried to keep in mind one or more mental objects. They described this activity as marginal and as not having hampered the execution of the task.
During the post-PET recognition task, the nine subjects reported that they used the mental image of the objects to identify the series they had just built; none of them declared that they made use of a verbal strategy. Among the 18 post-PET recognition tasks (two per subject), only one (5.6%) resulted in the choice of the CADB series in which no object stood in its right place (p = 0.057). One can therefore infer that all subjects built sufficiently accurate images to be used in the recognition task.
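The text does not say which test produced p = 0.057. One calculation that reproduces this value, shown here purely as an assumption, is a chi-square goodness-of-fit test (1 degree of freedom) comparing the single CADB choice among 18 tests with the 4.5 choices expected at the 0.25 chance level.

```python
import math

n_tests = 18     # two recognition tests for each of nine subjects
n_cadb = 1       # tests in which the CADB series was chosen
p_chance = 0.25  # probability of picking CADB at random

# Expected and observed counts for the categories "CADB" and "any other series".
expected = [n_tests * p_chance, n_tests * (1 - p_chance)]
observed = [n_cadb, n_tests - n_cadb]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 2), round(p_value, 3))  # 3.63 0.057
```

Other tests, such as an exact binomial test, would give a somewhat different value, so the match should be read as a plausible reconstruction rather than a confirmation of the method used.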
The average amplitudes of horizontal eye movements during CONS, LIST, and REST conditions were 3.6 ± 1.8° (mean ± SD), 3.2 ± 1.8°, and 3.1 ± 1.6°, respectively, with no significant difference between them (ANOVA for repeated measures; p = 0.13). Similarly, the average frequencies of eye movements during these conditions were 1.6 ± 1.3, 1.1 ± 0.9, and 1.2 ± 0.8 Hz, respectively [not significantly different (p = 0.15)].
Cerebral blood flow variations
As indicated above, five comparisons were performed: CONS versus REST, CONS versus LIST, LIST versus REST taken to reveal activations, and REST versus CONS and REST versus LIST to uncover NrCBF decreases. The stereotactic coordinates and spatial extent of the activated and deactivated areas are given in Tables 1–5. The corresponding Z volumes for activations are shown in Figure 3.
The CONS versus REST comparison revealed a significant and extensive occipitoparietal area of activation with local maxima located in the superior occipital gyri and in the inferior parietal lobule. An additional area of activation, albeit of smaller size and amplitude, was observed in the left fusiform gyrus.
In the frontal lobe, bilateral activations were observed in the lateral premotor region and in the supplementary motor area, both corresponding to Brodmann’s area 6. A small focus of activation was also observed in the right inferior frontal cortex.
The temporal lobes showed a bilateral flow increase. In the left hemisphere, it was located in the superior and middle temporal gyri. The pattern of activation was rather different in the right hemisphere: the activated area extended to the inferior temporal gyrus, where a local maximum was found.
No activation or deactivation was observed in the primary visual areas or nearby cortices.
Taking the LIST condition as a reference, we found a similar pattern of activation in the occipitoparietal and frontal cortices during the CONS task. Contrast of these two conditions revealed a large area of activation encompassing the bilateral superior occipital gyri and parietal lobules. Local maxima within this area were located in the left and right superior occipital gyri and in the right precuneus. An additional focus of increased blood flow was found in the right supramarginal gyrus.
The lateral premotor cortex and the supplementary motor area were also activated. As anticipated, no activation of the middle and superior temporal gyri was found in this contrast. The only significant temporal focus of activation was found in the right inferior temporal gyrus, confirming increased regional flow during the CONS condition.
Again, no activation or deactivation was observed in the primary visual areas or nearby cortices.
The LIST versus REST contrast revealed bilateral activations in the middle and superior temporal gyri when subjects listened to word lists. For both hemispheres, the maximum value voxel was located in the middle temporal gyrus. In the left hemisphere, the temporal activation pattern was similar to that described in the CONS condition. In the right hemisphere, however, there was no activation of the inferior temporal gyrus during the LIST condition.
Additional foci of activation were observed in the left inferior frontal gyrus, corresponding to Broca’s area, and in the right parahippocampal gyrus.
REST versus CONS (Table 4, top)
This comparison exhibited regions that were more active during the resting state than during the CONS condition, thus reflecting the NrCBF decreases during this last condition. The most significant decreases during CONS as compared with REST were observed in the medial part of the brain, namely in the medial superior frontal gyrus, the midcingulate, the posterior cingulate, and the paracentral lobule. Weaker deactivations were also observed in the left hemisphere, namely in the left middle temporal gyrus, the left central sulcus, the left insula, the left lingual gyrus, and the left inferior frontal gyrus.
REST versus LIST (Table 4, bottom)
In contrast to the strong blood flow reductions reported above, NrCBF decreases in the LIST condition as compared with REST were relatively few and involved the left middle occipital gyrus, the left premotor cortex, the right parieto-occipital sulcus, the left supramarginal gyrus, and the postcentral gyrus.
SPM analysis of CONS versus REST (Table 5, Fig. 4)
The SPM analysis of the data of our previous report is presented in Table 5 and Figure 4. Recall that in the previous study, the mental image originated from the reactivation of a visually memorized configuration. When compared with REST, the maximum flow increases were located in the premotor regions of the frontal lobe, extending to the more medial parts corresponding to the supplementary motor area. A parieto-occipital region was also activated bilaterally, extending from the parietal lobules to the superior occipital gyrus. The voxel of maximum activation was located in the precuneus on the internal face of the parietal lobe. A smaller amplitude activation was also detected in the left fusiform gyrus and the bilateral inferior temporal gyri.
DISCUSSION
Specialization of the visual and spatial routes involved in mental imagery
The task used in the CONS condition was designed to call strongly on visual imagery. The data that were collected during the post-PET task assessment and during debriefing sessions indicated that all subjects had indeed performed the task. Its execution elicited activation of posterior regions clearly distributed along an occipitoparietal axis. Similar results were obtained by the reanalysis of the mental exploration protocol. These superior occipital and parietal regions constitute the dorsal route, the role of which has until now been described only in the spatial processing of external visual stimuli (Mishkin et al., 1983; Haxby et al., 1991). The results of the present study, as well as those of our previous mental exploration protocol (Mellet et al., 1995), demonstrate that this route is also involved in the processing of nonperceptual spatial information. During CONS, the processing of words with a spatial semantic content resulted in the activation of the occipital and parietal regions, whereas the spatial information originally was available in a verbal form. On the other hand, mental exploration was performed on an image reactivated from a representation held in the visual memory. The same sensory modality thus was involved in the original input and in the mental image. Overall, our results indicate that the involvement of the dorsal route does not depend on the original input modality but does depend on the spatial nature of the information processed.
Mental construction, and to a lesser extent mental exploration, also elicited ventral activations located in the inferior temporal gyrus. The CONS task designed for this study required an implicit retention of visual attributes of each mental object both to complete the construction and to mentally visualize the whole object. One can assume that this visual mental image had to be kept in memory long enough for it to be used in the post-PET matching task. The mental exploration task did not require the encoding but did require the mnemonic recall of a visual representation. This implication of the visual memory may be reflected by the inferior temporal activation; this region is known to be involved in visual memory processes in both humans and monkeys (Miyashita, 1993). Moreover, there have been several reports of activation of inferior temporal lobes during PET studies on mental imagery (Kosslyn et al., 1993; Roland and Gulyas, 1995), which indicates that this structure could also play a role in the maintenance of mental images.
Mental imagery and verbal representations
In the CONS condition, the subjects used acoustico-verbal information to assemble units into three-dimensional objects that had no physical counterparts. Because verbal and visual representations are different cognitive entities (Paivio, 1986), this operation implies on-line translation of the semantic content of verbal stimuli into picture-like representations. The flow increases in the superior and middle temporal gyri observed during both the CONS and the LIST conditions likely reflect the lexico-semantic treatment required in these two cases (Petersen et al., 1988; Wise et al., 1991; Price et al., 1992; Guaraglia et al., 1993; Mazoyer et al., 1993). The locations and intensities of these activations were remarkably similar in these two conditions. Although it requires the semantic processing of the words, the CONS task did not implicate language areas in a more extensive manner than the LIST task did. Rather, an activation of Broca’s area was detected during LIST conditions, congruent with previous PET studies (Mazoyer et al., 1993), with no equivalent during CONS conditions. The lack of Broca’s area activation during the latter task could be related to the nature of the words that were used. Only a few spatial words exist compared with the abundance of substantives available for object description (Landau and Jackendoff, 1993). It has been suggested that the category of spatial words could be processed by motor and spatial systems, namely the premotor and occipitoparietal cortex, rather than by classical language areas (Landau and Jackendoff, 1993; Jeannerod, 1994).
The CONS and LIST conditions elicited very different patterns of decreases when compared with REST. Although their physiological significance remains unclear, the medial frontal and posterior cingulate deactivations observed during CONS were close to those observed previously during visual matching tasks (Haxby et al., 1994) or visual memory tasks (Moscovitch et al., 1995; Courtney et al., 1996), in agreement with the visuospatial nature of our task. On the other hand, the NrCBF decreases evidenced during the LIST condition were located in the visual associative areas (disregarding the primary visual area) and in the premotor cortex. This fact may reflect the cross-modal inhibition described previously in visual areas when a somatosensory stimulus was applied (Kawashima et al., 1993; O’Sullivan et al., 1994).
Mental imagery and visual areas
The dorsal route was activated in the absence of any visual perceptual input and was not accompanied by any activation of the primary visual areas. These results revive the discussion on the role of “top-down” activations during mental imagery activity. Recent related literature illustrates a debate about the involvement of the primary visual areas (PVAs) in mental imagery tasks performed in total darkness (Kosslyn and Ochsner, 1994; Moscovitch et al., 1994; Roland and Gulyas, 1994). In short, the REST condition used as a reference condition by the groups that do not evidence activation of PVA is suspected to be the cause of this absence of activation. Spontaneous imagery activity during the REST condition is thought to hide any involvement of the primary visual cortex during the activation condition (Kosslyn et al., 1996). This objection, however, is not valid in the present work because two reference conditions were used. No activity of the primary visual area was noticed, whether the CONS condition is compared with the REST condition or with the listening of abstract word lists. Methodological arguments are thus not sufficient to explain the absence of any activation of early visual areas in our studies.
It may be that only the figural aspects of mental imagery could be involved in the primary visual areas. Neuropsychological studies of unique cases have demonstrated the existence of a double dissociation between the object discrimination and the spatial localization in imagery tasks (Levine et al., 1985; Farah et al., 1988). Therefore, it is possible for our CONS task, which is spatial in nature, to be executed without resorting to the primary visual cortex.
It seems doubtful therefore that a classical procedure of the “bottom-up” type limited to the visual modality (where information would be treated from “low level” to “high level” regions) could be invoked for the recruitment of the dorsal route during the CONS task. Our results are more in favor of a hypothesis in which temporal areas of language can guide the information directly toward the associative visual areas or premotor areas without going through the primary cortex. As shown in this study, the passage from a verbal representation to a visuospatial representation involves structures specialized in the treatment of each type of modality, with both structures being necessary and sufficient for changing the nature of the information.
In the present study as well as in our previous mental exploration study, we observed bilateral premotor activation concurrent with the involvement of the dorsal route. This activation cannot be attributed to an increase of oculomotor activity during the CONS task, because the amplitude and the frequency of eye movements did not show any significant differences when compared with both control conditions. The construction of a mental object requires that after transcoding the semantic information into a spatial representation, the relative localization of each cube is upheld on-line so that the object is assembled correctly. The role of the visuospatial sketch pad, one of the working memory components, is to maintain visuospatial information in the short term (Baddeley, 1992). Its involvement in the CONS task could result in the simultaneous activation of the parietal cortex and the premotor lateral cortex. In fact, coupled activations of the parietal and premotor cortices have been described in visuospatial tasks, such as spatial localization (Haxby et al., 1994) or shifting of spatial attention (Corbetta et al., 1993), and in situations explicitly involving the spatial working memory (Jonides et al., 1993; Courtney et al., 1996). They were described recently in a study on the execution of prelearned sequences of eye saccades in total darkness (Petit et al., 1996), in which it was emphasized that this frontoparietal interaction is not dependent on a perceptual activity. Our results demonstrate that an exchange of information between the premotor and the parietal areas is also necessary when the visuospatial stimulus is processed only mentally. They also mean that this interaction is independent from the execution of a motor activity. 
It is likely that the parietal “perceptual” pole and the frontal “motor” pole systematically exchange spatial information, whether a motor action is envisioned or not, thus executing the encoding of a spatial environment in its descriptive and behavioral aspects. Note that because of its poor temporal resolution, the PET technique is unable to answer the question of whether the information coming from the auditory cortex is first sent to the premotor cortex and then to the occipitoparietal cortex or whether remote visual areas first receive the information that is then conveyed to the premotor cortex. Such an interaction between perceptual and motor components in the treatment of spatial information has been postulated in the visual perception domain (Mesulam, 1990). The fact that this coupled activation was also detected during the mental exploration task indicates that this interaction was not specific for the spatial working memory but could also be required in the scanning of a mental image reactivated from long-term memory. The exchange of information between the premotor regions and the dorsal route then appears as a general feature during spatial processing, whatever the nature of the initial input.
In conclusion, this study demonstrates that the involvement of the occipitoparietal–frontal network for spatial processing is not bound to the modality under which information is delivered. This large-scale neural network (Mesulam, 1994), made of visual unimodal and heteromodal associative regions, can operate on nonsensory inputs and be engaged regardless of how the information is delivered, and it can participate in either the mental scanning of a mental image or the purely imagined creation of mental objects.
This work was supported in part by a grant from the Programme Cognisciences of the Centre National de la Recherche Scientifique, Axe Thématique National “Représentation de l’Espace.” We are indebted to L. Laurier for her invaluable help with data acquisition and to the Orsay chemistry staff for tracer production. We also thank L. Petit for thoughtful comments.
Correspondence should be addressed to Professor Bernard Mazoyer, Groupe d’Imagerie Neurofonctionnelle, Groupement d’Intérêt Public Cyceron, Boulevard Becquerel, BP 5229, F-14074 Caen Cedex, France.