When sensory input allows for multiple, competing perceptual interpretations, observers' perception can fluctuate over time, which is called bistable perception. Imaging studies in humans have revealed transient responses in a right-lateralized network in the frontal-parietal cortex (rFPC) around the time of perceptual transitions between interpretations, potentially reflecting the neural initiation of transitions. We investigated the role of this activity in male human observers, with specific interest in its relation to the temporal structure of transitions, which can be either instantaneous or prolonged by periods during which observers experience a mix of both perceptual interpretations. Using both bistable apparent motion and binocular rivalry, we show that transition-related rFPC activity is larger for transitions that last longer, suggesting that rFPC remains active as long as a transition lasts. We also replicate earlier findings that rFPC activity during binocular rivalry transitions exceeds activity during yoked transitions that are simulated using video replay. However, we show that this established finding holds only when perceptual transitions are replayed as instantaneous events. When replay, instead, depicts transitions with the actual durations reported during rivalry, yoked transitions and genuine rivalry transitions elicit equal activity. Together, our results are consistent with the view that at least a component of rFPC activation during bistable perception reflects a response to perceptual transitions, both real and yoked, rather than their cause. This component of activity could reflect the change in sensory experience and task demand that occurs during transitions, which fits well with the known role of these areas in attention and decision making.
Perception is an active process wherein the brain seeks to interpret what is giving rise to the current patterns of sensory stimulation. When that stimulation is ambiguous, perception tends to fluctuate spontaneously and unpredictably between alternative possible interpretations, an outcome dubbed bistable perception. What triggers these internally generated reorganizations of perception is as much a matter of debate today (Blake and Logothetis, 2002) as it was a century ago when Hermann von Helmholtz and Ewald Hering argued about this issue; Helmholtz attributed perceptual transitions to shifts in attention, whereas Hering thought they were caused by sensory adaptation (Helmholtz, 1910; Hering, 1964).
Brain imaging studies in humans show that a broad network of primarily right-hemisphere frontal and parietal areas becomes active around the time of perceptual transitions (Kleinschmidt et al., 1998; Lumer et al., 1998; Sterzer and Kleinschmidt, 2007). Here, we ask how this frontoparietal activity relates to the time course of different kinds of transitions that occur during bistable perception. After all, we know that some transitions occur virtually instantaneously, with one percept abruptly replacing its counterpart, whereas other transitions comprise dynamic mixtures of both percepts for variable periods of time before one percept dominates completely (Hollins and Hudnell, 1980; Anstis et al., 1985; Brascamp et al., 2006; Klink et al., 2010), a notion we corroborate in a behavioral experiment using four different bistable stimuli (see Results, below) (Table 1). In this paper, we describe neural responses in the human brain accompanying transitions between perceptual states, taking into account the fine temporal structure of these transitions. Our results shed additional light on the functional role of frontoparietal activity and, hence, on the mechanisms underlying perceptual reorganizations during bistable perception. By way of preview, our results suggest that activity in frontoparietal areas is a consequence of perceptual transitions, not just their cause.
We scanned the brains of observers as they tracked perception of a bistable stimulus using three response keys: one for each of the two exclusive percepts and a third for perceptual mixtures. To increase the generality of our findings, we used two different bistable stimuli. In bistable apparent motion (AM) (Fig. 1a), the stimulus contains equal motion energy in two opposite directions, causing perception to waver between motion in these two directions. In addition to these two directional percepts, observers also perceive mixture states with no net motion during prolonged transitions (Anstis et al., 1985) (Fig. 1a). We also used binocular rivalry (BR) (Fig. 1b), which occurs when the two eyes view dissimilar inputs that resist fusion and, instead, compete for perceptual dominance over time. During transitions from one dominant percept to the other, observers can experience any of several forms of mixed dominance (Yang et al., 1992), including piecemeal dominance wherein one eye's image gradually replaces the other as their separating border passes over the stimulus area in a wave-like fashion (Fig. 1b) (Wilson et al., 2001).
Materials and Methods
fMRI of apparent motion
Stimuli (Fig. 1a) were sinusoidal spiral gratings of four periods per circumference, presented within an annular window (peaking at 2.7° of eccentricity), with a Gaussian fall-off (σ = 1.5°), on a gray background. The stimulus flickered in counterphase at 3–4 Hz, depending on the outcome of a pilot experiment run outside the scanner, aimed at finding the temporal frequency resulting in the clearest motion percepts. At the center of the screen appeared a small, white fixation mark (40′ angular subtense). The sign of the spiral angle of the apparent motion stimulus reversed from trial to trial to change the local orientation within the stimulus, thereby avoiding pattern-specific adaptation across trials. Stimulus contrast was 50% in the first two trials, 75% in the third and fourth trial, and 100% in the last two trials of each block. This was done to counter the increase in mixture perception that tends to occur over prolonged exposure to apparent motion (Anstis et al., 1985).
Functional imaging procedure.
Observers lay supine in a 3T Philips Achieva scanner at the Vanderbilt Imaging Center while viewing stimuli on a back-projected, gamma-linearized screen through the bore, made visible by means of a mirror attached to the coil array. Sessions consisted of three blocks of six 144 s runs (separated by rest), anatomical scans, and motion-mapping runs. At a TR of 2.0 s, this amounted to 72 TRs (after discarding the first 5 TRs worth of data to minimize T1 saturation effects). The scanning sequence was a SENSE EPI sequence, and TE was 30 ms with a flip angle of 80°. Slice thickness was 3 mm and gap size was 0.3 mm (64 × 64 voxels; field of view, 192 × 115.2 × 192 mm; whole brain). The experiment complied with ethical guidelines and was approved by the Vanderbilt Institutional Review Board. All observers gave their written consent before scanning commenced.
Six observers reported their perceptual states using three buttons, one for each of the two exclusive percepts and one for reporting the occurrence of mixture states during transitions. The first and last 12 s of each run were fixation periods.
Anatomical scans (1 × 1 × 1 mm) were automatically segmented and inflated using the FreeSurfer package for visualization (Dale et al., 1999; Fischl et al., 1999, 2004). Functional imaging data were analyzed using FEAT 5.92, part of the FSL package (Smith et al., 2004). After motion correction, brain extraction, alignment, spatial smoothing (5 mm), and high-pass filtering (Gaussian-weighted least-squares straight line fitting, with σ = 66.0 s), a general linear model (GLM) taking into account head-motion parameters and the derivative of the hemodynamic response function was fit to the data. Regression variables were based on the perceptual time course, as indicated by the button-press records. To stay close to existing literature (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007), the principal GLM contrasts in the main text derive from regression variables in which transitions were modeled as 500 ms events, independent of their reported duration. If, instead, we model transitions as events that last their actual, reported duration, all results remain qualitatively the same (data not shown). The brief events used to model prolonged transitions were placed at the temporal midpoint of the reported transition. Mixed-effects, across-observer analyses were run using FLAME (level 2) (Beckmann et al., 2003; Woolrich et al., 2004) and the resulting BOLD data were projected onto a reconstructed and inflated surface using the FreeSurfer package. To assess the influence of transition duration, we contrasted each observer's transitions that lasted longer than their median duration with transitions that lasted the same as or less time than this value. For visualization (Figs. 2, 3), Z (Gaussianised T/F) statistic images were thresholded using clusters determined by Z > 2.3 (p < 0.01) and a corrected cluster significance threshold of p = 0.05.
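The construction of the transition regressors and of the median split described above can be illustrated with a minimal sketch. It is not the analysis code used in the study: the key-press record format, the function names, the centering of each 500 ms event on the transition midpoint, and the counting of percept A to mixture to percept A sequences as transitions are all assumptions made for illustration. The output rows follow the generic (onset, duration, weight) layout commonly used for event files.

```python
from statistics import median

EVENT_DURATION = 0.5  # s; transitions modeled as 500 ms events (value from the text)


def extract_transitions(report):
    """Return (start_s, duration_s) for each transition in a perceptual report.

    `report` is a hypothetical list of (time_s, state) key-press records,
    sorted by time, with state in {"A", "B", "mix"}. A transition starts when
    an exclusive percept ends and finishes when the next exclusive percept
    begins; a direct A -> B switch has duration 0 (an instantaneous transition).
    """
    transitions = []
    mix_start = None
    previous = None
    for time_s, state in report:
        if state == "mix":
            if previous in ("A", "B"):
                mix_start = time_s  # an exclusive percept just ended
        elif previous == "mix" and mix_start is not None:
            transitions.append((mix_start, time_s - mix_start))
            mix_start = None
        elif previous in ("A", "B") and state != previous:
            transitions.append((time_s, 0.0))
        previous = state
    return transitions


def midpoint_events(transitions):
    """Rows of (onset, duration, weight); each event is centered on the
    temporal midpoint of the reported transition (centering is an assumption)."""
    rows = []
    for start, duration in transitions:
        midpoint = start + duration / 2.0
        rows.append((midpoint - EVENT_DURATION / 2.0, EVENT_DURATION, 1.0))
    return rows


def median_split(transitions):
    """Split into transitions longer than the median and the rest (<= median)."""
    cutoff = median(d for _, d in transitions)
    longer = [t for t in transitions if t[1] > cutoff]
    shorter = [t for t in transitions if t[1] <= cutoff]
    return longer, shorter


# Made-up example report for one observer.
report = [(0.0, "A"), (4.0, "mix"), (6.0, "B"),
          (11.0, "A"), (15.0, "mix"), (16.0, "B")]
transitions = extract_transitions(report)
events = midpoint_events(transitions)
longer, shorter = median_split(transitions)
```

Because the split is made at each observer's own median, the longer and shorter sets are defined relative to that observer's behavior rather than against a fixed duration.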
fMRI of binocular rivalry
In addition to periods where observers tracked bistable perception, for binocular rivalry we also included periods where we simulated rivalry by recording observers' perceptual reports on trials involving genuine rivalry and then replaying these perceptual sequences physically on the screen (Fig. 3a). This approach allows one to distinguish the specific correlates of endogenously generated perceptual transitions from nonspecific activation (Lumer et al., 1998). We used two distinct replay conditions. In the first replay condition (instantaneous replay), we replayed all perceptual transitions as instantaneous events, the tactic used in several previous studies (Lumer et al., 1998; Tong et al., 1998; Polonsky et al., 2000; Sterzer and Kleinschmidt, 2007). In the second, novel replay condition (duration-matched replay), replayed transitions had the same duration as reported during actual rivalry. This involved simulating transitional mixtures by means of a wave traveling across the stimulus area, replacing one image with the other.
Stimuli (Fig. 1b), modeled after ones used previously (Haynes and Rees, 2006; Wilcke et al., 2009), were rotating square-wave gratings (0.5 c/deg) presented on a black background within an annular window with an inner and outer radius of 1.75° and 4.5°, respectively. Outside of these radii, contrast fell off linearly across 0.25°. A green grating was presented to one eye and an orthogonally oriented red grating was presented to the other eye. Fusion was aided using a white plus sign at fixation, a white box framing the stimulus (side = 9.5°), and vertical lines connecting the center of the upper and lower screen edge with the framing box. Dichoptic projection in the scanner was achieved by means of a custom setup using prisms and a septum, following Schurger (2009). The contrast (mean, 47% Michelson) and rotation speed (mean, 1.25 Hz) were established for each observer beforehand to ensure long dominance durations (5.44 s on average); the relative contrasts of the two colors were further adjusted beforehand to ensure balanced dominance. In the duration-matched replay condition, perceptual transitions were replayed as follows: a smoothed, straight boundary would sweep across the stimulus, starting from a randomly selected side and replacing one grating with the other. The durations of these simulated transitions, which varied randomly during an observation period, were identical to those previously reported under genuine rivalry. Instantaneous transitions in both duration-matched and instantaneous replay trials were simulated by smoothly decreasing the contrast of the dominant grating pattern while simultaneously increasing the contrast of the other over the course of 200 ms, following Lumer et al. (1998). Transitions that were reported as exceedingly brief (shorter than a manual reaction time of 400 ms) (van Dam and van Ee, 2005) were also replayed in this latter manner, even in the duration-matched replay condition (20.1% of all transitions). 
Similarly, all simulated transitions were shifted back in time by 400 ms to account for reaction time delays of manual reports during the preceding rivalry period. During replay, the stimulus was presented identically to both eyes.
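The replay rules described above (a 400 ms backward shift, a 200 ms cross-fade for instantaneous transitions, and a traveling boundary for longer transitions in the duration-matched condition) can be summarized in a short sketch. The function name, the tuple layout, and the clamping of shifted onsets at zero are assumptions for illustration only, not a description of the stimulus code actually used.

```python
REACTION_TIME = 0.4  # s; cutoff for "exceedingly brief" reports and backward shift (from the text)
CROSS_FADE = 0.2     # s; duration of the simulated instantaneous cross-fade (from the text)


def replay_schedule(reported, duration_matched):
    """Map reported (onset_s, duration_s) transitions to replay events.

    Returns (onset_s, duration_s, kind) tuples, where kind is "cross_fade"
    (instantaneous transition) or "wave" (boundary sweeping across the stimulus).
    """
    schedule = []
    for onset, duration in reported:
        # Shift back in time to compensate for the manual reaction time.
        # Clamping at 0 is an assumption for transitions reported very early.
        shifted = max(0.0, onset - REACTION_TIME)
        if duration_matched and duration >= REACTION_TIME:
            schedule.append((shifted, duration, "wave"))
        else:
            # Instantaneous replay, or a transition reported as shorter
            # than the reaction time in the duration-matched condition.
            schedule.append((shifted, CROSS_FADE, "cross_fade"))
    return schedule


# Made-up reports: one instantaneous, one very brief, one prolonged transition.
reported = [(5.0, 0.0), (12.0, 0.25), (20.0, 1.5)]
matched = replay_schedule(reported, duration_matched=True)
instantaneous = replay_schedule(reported, duration_matched=False)
```

In this sketch the two replay conditions differ only in how transitions of at least 400 ms are rendered, which mirrors the fact that the event timing was identical across conditions.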
Five observers reported perception using three keys, during both genuine rivalry and replay. Runs were of either rivalry-only or rivalry-plus-replay type. All runs started with a 10 s fixation period, after which 120 s of rivalry ensued. Rivalry runs ended after a subsequent 6 s fixation period. For replay runs, the rivalry period was followed by 10 s of fixation, during which observers were informed of the upcoming replay period, and a 120 s period of replay during which we replayed the rivalry sequence just reported. After the replay period, the run ended after 6 s of fixation.
Functional imaging procedure.
At TR = 2.0 s, we ran either 68 or 133 TRs for rivalry-only and rivalry-plus-replay runs, respectively (after discarding the first 5 TRs worth of data to minimize T1 saturation effects). All other parameters were identical between the two experiments.
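As a quick consistency check, the volume counts quoted above follow from the run structure described for the two run types. The helper name below is hypothetical, and the sketch assumes that the quoted counts cover exactly the fixation, rivalry, and replay periods.

```python
TR = 2.0  # s; repetition time (from the text)


def n_volumes(period_durations_s, tr=TR):
    """Number of TRs needed to cover consecutive periods of a run."""
    total = sum(period_durations_s)
    volumes, remainder = divmod(total, tr)
    if remainder:
        raise ValueError("run length is not a whole number of TRs")
    return int(volumes)


# fixation, rivalry, fixation
rivalry_only = n_volumes([10, 120, 6])                  # 136 s -> 68 TRs
# fixation, rivalry, fixation, replay, fixation
rivalry_plus_replay = n_volumes([10, 120, 10, 120, 6])  # 266 s -> 133 TRs
```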
The GLM contrasts for this stimulus were obtained in the same manner as those for apparent motion. For visualization, we used the same statistical thresholds as in the case of apparent motion. For ROI selection for event-triggered analyses, we used the voxels that showed significant activation (as described above) in the basic contrast (perceptual transition > baseline activation during genuine rivalry) combined with anatomical masks based on Harvard-Oxford and Jülich anatomical atlases; ROIs were defined for each observer individually. We used this particular BOLD contrast for ROI selection, both because it is broad (selecting all areas involved in perceptual transitions) and because it is neutral with regard to the comparison between the two kinds of replay conditions that was the main goal of the event-triggered analysis. The frontal areas joined for the ROI used in this analysis were frontal eye field (FEF), inferior frontal cortex (IFC), dorsolateral prefrontal cortex (DLPFC), and insula. The temporal jitter inherent in the irregular time course of bistable perception causes a uniform sampling over the TR period. Therefore, an arbitrarily spaced sampling around an event of interest can be used to create event-triggered averages from these data. We chose to pool all data in 4 s intervals surrounding any time point and to slide this window in 0.5 s steps (Fig. 3d) or 0.25 s steps (Fig. 4). Note that, to stay close to the methodology of previous papers (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007), all functional masks for event-triggered analyses were constructed by modeling transitions as 500 ms events occurring at the temporal midpoints of the reported transitions. In light of our present conclusion that frontal and parietal areas respond throughout transitions, rather than to instantaneous events, it may be preferable to model transitions as lasting their actual reported duration. 
This approach gives very similar results but reduces error bars (data not shown).
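The sliding-window pooling used for the event-triggered averages can be sketched as follows. This is a minimal illustration rather than the analysis code used here: the function name, the plain-list data format, and the range of latencies covered (`t_min`, `t_max`) are assumptions, whereas the 4 s window and the 0.5 s step are the values given above (the step can be set to 0.25 s for the finer sampling).

```python
def event_triggered_average(sample_times, signal, event_times,
                            t_min=-6.0, t_max=12.0, window=4.0, step=0.5):
    """Sliding-window event-triggered average.

    Every sample is assigned a latency relative to every event. For each
    window center c (in s relative to the event), all samples with latency
    within c +/- window/2 are pooled and averaged. Centers run from t_min
    to t_max in increments of `step`; t_min and t_max are assumed values.
    """
    latencies = [(t - e, y)
                 for e in event_times
                 for t, y in zip(sample_times, signal)]
    centers, means = [], []
    n_steps = int(round((t_max - t_min) / step))
    for i in range(n_steps + 1):
        center = t_min + i * step
        pooled = [y for lat, y in latencies
                  if abs(lat - center) <= window / 2.0]
        if pooled:  # skip centers for which no samples are available
            centers.append(center)
            means.append(sum(pooled) / len(pooled))
    return centers, means


# Made-up data: one sample per TR of 2.0 s and a single-sample response
# 4 s after a transition reported at t = 10 s.
sample_times = [2.0 * i for i in range(20)]
signal = [1.0 if t == 14.0 else 0.0 for t in sample_times]
centers, means = event_triggered_average(sample_times, signal, event_times=[10.0])
curve = dict(zip(centers, means))
```

With many events falling at irregular moments relative to the TR, the pooled latencies cover the whole interval densely, which is what allows the window to be moved in steps smaller than the TR.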
In addition to our imaging experiments, we performed a behavioral experiment to examine the time course of perceptual transitions in a set of different bistable stimuli, including the stimuli used in our present imaging experiments as well as stimuli used in previous imaging studies. In each session, observers were shown one of four different stimuli, two that caused binocular rivalry and two that caused bistable apparent motion perception. The two binocular rivalry stimuli were (1) the orthogonal gratings used in our BR imaging experiment and (2) a green face presented to one eye paired with a horizontal square-wave grating (spatial frequency 1.6 c/deg, contrast 76% Michelson) presented to the other eye, based on Lumer et al. (1998). The grating translated upward or downward in separate trials (1.7 periods/s). This stimulus was presented within a white square (side, 3.5°) on a black background, with white lines extending from the square to the top and bottom edges of the screen to aid eye alignment.
The two apparent motion stimuli were (1) the ambiguously rotating spiral described in the text and (2) a two-frame animation stimulus based on Sterzer and Kleinschmidt (2007). The latter consisted of two continually alternating frames. One frame showed two white dots (76 cd/m²; diameter, 0.17°) on a gray background (12 cd/m²), one dot on the bottom right and one on the top left corner of an imaginary rectangle (width, 2°; height, 3°) centered on a white fixation cross. The other frame showed two white dots on the two remaining corners. The two frames alternated four times each second, causing apparent motion between the dots on alternate frames, with the dots appearing to jump either horizontally or vertically.
As was the case during the imaging experiments, observers used three keys to report their perceptual experience: one for each exclusive percept and a third to indicate transition states. Moreover, after the experiment, observers were asked to produce a written report describing the variety of percepts they had experienced.
Results
The behavioral experiment outside the scanner confirmed that both our BR stimulus and our AM stimulus produced a mixture of both instantaneous perceptual transitions and transitions that took more time to unfold (31% and 83% of transitions were reported as instantaneous for BR and AM, respectively), and that our stimuli were comparable in this respect to BR and AM stimuli that we copied from existing imaging studies [we recorded 35% and 65% instantaneous transitions for BR and AM stimuli from Lumer et al. (1998) and Sterzer and Kleinschmidt (2007), respectively]. Instantaneous transitions were more commonly reported for the AM stimuli than the BR stimuli (paired t test, p < 0.001), but both instantaneous and prolonged transitions occurred for all four stimuli for each of our observers. To provide more insight into the variety of percepts these stimuli may elicit, Table 1 lists subjective descriptions provided by our observers during debriefing.
For our fMRI experiment using bistable apparent motion (Fig. 1a), we modeled all transitions as 500 ms events occurring at their temporal midpoint and contrasted these events against baseline activation during stimulus presentation, as described previously (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007). We found transition-related activity primarily located in the right hemisphere, including intraparietal sulcus (IPS), superior parietal lobule (SPL), and more anterior parietal areas leading to the right hemisphere temporal-parietal junction (TPJ); and frontal areas comprising the FEF, inferior frontal junction (IFJ), and DLPFC (Fig. 2; Table 2). This pattern of activity is entirely consistent with previous findings (Sterzer and Kleinschmidt, 2007). As a first step toward investigating the relation of this activation with the time course of perceptual transitions, we contrasted transitions that take a longer time to unfold (> median duration) to shorter ones (≤ median duration). This longer > shorter contrast shows activation in a very similar array of cortical areas, with the addition of insula and anterior cingulate cortex (ACC) (Fig. 2b), with no major regions of activation in the original contrast (Fig. 2a) disappearing when we apply this between-duration contrast. This finding indicates that frontal and parietal brain regions whose activity is associated with perceptual transitions during ambiguous motion perception respond more strongly when transitions last longer.
To test whether this result generalizes to a different type of perceptual bistability, we used the same event-related paradigm to measure activations while observers viewed and reported on fluctuations in perception during BR (Fig. 1b). In binocular rivalry, perceptual alternations arise from an entirely different kind of sensory conflict, stemming from an incongruence between the two eyes' images. Figure 2c shows the areas of the brain that respond more strongly to binocular rivalry transitions than to a baseline of mere stimulus presence. Compared with AM (Fig. 2a), we see stronger activation in occipital cortex and insula for this stimulus. It is possible that the differences in occipital cortex are attributable to the distinct sensory stimulation provided by the two stimuli. However, other than those regions, the array of activated areas is very similar to the array of areas that respond to apparent motion transitions, again including TPJ, IPS, FEF, right IFJ (rIFJ), and DLPFC. This result is consistent with previous findings for BR (Lumer et al., 1998). More importantly, a contrast between long and short BR transitions (Fig. 2d) again reveals several of the same areas, including rIFJ, FEF, IPS, and TPJ. Areas missing from this between-duration contrast for BR (Fig. 2d) include DLPFC and insula, even though the analogous contrast for AM did reveal those areas. One possibility is that DLPFC and insula would also appear for BR if statistical power were increased. Importantly, however, the main areas of the frontoparietal network known to respond during transitions in bistable perception, namely rIFJ, FEF, and parietal areas like IPS and TPJ, respond more strongly to long transitions than to short ones for both types of bistable stimuli.
These findings indicate that transition duration is an important determinant of transition-related activation in frontal and parietal brain areas. This is interesting in light of the hypothesis that activity in these regions triggers perceptual transitions (Lumer et al., 1998; Leopold and Logothetis, 1999; Sterzer and Kleinschmidt, 2007; Sterzer et al., 2009). To accommodate our findings within the context of this trigger hypothesis would require revising it to posit that a stronger trigger is required for longer transitions. Alternatively, our results could suggest that frontoparietal activity occurs in response to transitions as they unfold and not only preceding them as implied by the trigger hypothesis. This would parsimoniously explain the enhanced responses when transitions last longer.
In a number of previous brain imaging studies of bistable perception, brain activation measured during bistable perception was compared with brain activation during replay conditions where perceptual fluctuations were simulated by physically replaying them on the screen, based on observers' tracking reports from preceding trials of genuine bistability (Lumer et al., 1998; Tong et al., 1998; Polonsky et al., 2000; Tong and Engel, 2001; Lee et al., 2007; Sterzer and Kleinschmidt, 2007). In those studies focusing specifically on transitions, this comparison between bistability and replay has been treated as a litmus test for identifying activity that is genuinely related to endogenous perceptual transitions, reasoning that transition-related activation observed during replay probably reflects a general response to perceptual change and not a response unique to transitions in bistable perception per se. It is noteworthy, however, that the replay conditions in those studies simulated all perceptual transitions as instantaneous (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007). Could activations observed during genuine bistability but not during replay be due, at least in part, to differences in transition durations between genuine and replay conditions (i.e., longer, more variable during genuine bistability vs always instantaneous during replay)?
To address this possibility, we measured activations associated with two distinct kinds of replay runs in our binocular rivalry experiment. In runs using what we call instantaneous replay, we displayed all perceptual transitions as instantaneous; this procedure was copied directly from one used previously (Lumer et al., 1998), in which one of the images was rapidly cross-faded with the other image to create an essentially instantaneous simulated transition. In separate runs, we used a novel type of replay transition, dubbed duration-matched replay, that more closely mimicked transitions actually experienced during binocular rivalry. The transitional mixtures used in these duration-matched runs were simulated by means of a wave traveling across the stimulus area, replacing one image with the other (a common type of mixture percept) (Wilson et al., 2001). The durations of these simulated transitions were equal to the durations observers had reported during the immediately preceding binocular rivalry presentation period. This second replay condition was more realistic, because it contained the same range of transition durations experienced during genuine binocular rivalry.
The conventional contrast between endogenous perceptual transition events during binocular rivalry and yoked transition events during instantaneous replay (Fig. 3b; Table 3) revealed the same areas previously observed using this contrast for binocular rivalry (Lumer et al., 1998) as well as other bistable stimuli (Sterzer and Kleinschmidt, 2007), namely IPS, TPJ, FEF, IFJ, DLPFC, insula, and ACC. Significantly, however, when we contrasted perceptual transitions during rivalry with yoked transitions in the novel, duration-matched replay condition, we no longer observed any differential activation in frontal-parietal areas, nor elsewhere outside of occipital cortex (Fig. 3c; Table 3).
The absence of activation differences when contrasting rivalry and duration-matched replay cannot be attributed to a lack of statistical power. This contrast was comparable to the contrast using instantaneous replay (which did reveal strong activation) in terms of data acquisition, the number of perceptual transition events analyzed (382 and 414 transitions in the instantaneous and duration-matched replay conditions, respectively), and the analysis method used. Indeed, upon closer inspection, the lack of significant voxels in this contrast does not reflect a poor signal but, on the contrary, the presence of a robust signal during duration-matched replay that was not present during instantaneous replay. This is underscored by an event-triggered analysis in right frontal regions, the trademark area of this frontal-parietal network (Fig. 3d). This analysis shows virtually identical responses to both genuine transitions during perceptual bistability and duration-matched replay transitions (Fig. 3d), explaining the lack of signal in the contrast between the two. This pattern of results is very similar for all further subdivisions of frontal cortex; parietal areas show very similar results (Fig. 4). Our findings therefore indicate that these areas respond highly similarly during genuine rivalry and replay, as long as the durations of transitions are matched between the two conditions. Note that the observation of a reduced response during instantaneous replay in these areas (Fig. 3d, orange curve) is consistent with earlier findings (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007), yet the magnitude of this reduction appears larger here than in those studies. One reason for this quantitative difference might be that for our BR stimulus, which contained continuous motion, the cross-fade at the time of a simulated instantaneous transition constituted only a modest bottom-up transient.
Discussion
Our findings encourage an expanded view of the role of the right-lateralized frontal-parietal network that has been implicated in perceptual transitions during bistable perception (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007). We show that a large part of this network responds more strongly to transitions of longer duration than to those of shorter duration (Fig. 2b,d) for two very different types of bistable perception. In addition, activation in this network is equivalent to that associated with viewing video animations that mimic bistable perception, as long as the two conditions are matched for transition duration (Fig. 3c,d). Together, these results indicate that the putative role of this frontoparietal network in bistable perception may need to be expanded. Existing accounts posit that responses in this network of frontoparietal areas actively initiate perceptual transitions (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007), a conclusion based largely on explicit comparison of endogenous perceptual transitions with replayed transitions whose durations were not matched to those of the endogenous transitions. For two reasons, our results suggest that this idea may deserve refinement. First, we find that transition duration is a main determinant of frontoparietal activity (Fig. 2). Second, when we match for transition duration, we find equivalent frontoparietal activity during bistable perception and replay, even though replayed transitions are not caused by any brain area, but are initiated externally by the computer (Fig. 3).
If we are correct that frontoparietal activations are not solely related to triggering transitions, to what do we attribute the responses observed in those areas during transitions? There are several known properties of this frontoparietal network that would predict such transition-related responses. First, when perception changes at the time of a transition, the observer's visual experience saliently differs from the stable dominance that is seen the majority of the time, and this may play a role similar to infrequently occurring odd-ball stimuli that activate similar cortical areas in other stimulus paradigms (Corbetta and Shulman, 2002; Stevens et al., 2005; Raemaekers et al., 2009; Asplund et al., 2010). This is true even without a task (Downar et al., 2002), just as passive viewing of binocular rivalry without a task still activates frontal and parietal areas (Lumer and Rees, 1999; Wilcke et al., 2009). Second, transitional periods with mixed perception potentially make it harder for an observer to decide what is being seen, compared with periods of exclusive dominance, thereby transiently increasing task difficulty. Areas in the right inferior frontal cortex are known to increase their activation with increased task difficulty (Binder et al., 2004; Heekeren et al., 2004) and there are strong connections between inferior frontal areas and brainstem regions whose activity regulates arousal and vigilance through the release of norepinephrine (Corbetta et al., 2008). Third, perceptual transitions in binocular rivalry that involve a traveling wave (Wilson et al., 2001) are likely to prompt a redirection of spatial attention across the area of the stimulus. Redirection of spatial attention is known to activate areas in the network that are activated around the time of a rivalry transition, such as IPS and FEF (Corbetta and Shulman, 2002; Yantis et al., 2002; Silver et al., 2005; Kelley et al., 2008; Szczepanski et al., 2010). 
Fourth, in the case of a prolonged transition, the onset of the transition changes the observer's task from reporting the end of an exclusive dominance period to reporting the beginning of a new one if the perceptual mixture was to be reported separately. Similarly, in paradigms where only the exclusive percepts are to be reported, the onset of a prolonged transition changes the task from waiting for a perceptual change to deciding whether the current mixture should be classified as one or the other percept. Indeed, the areas we find to activate during prolonged perceptual transitions consistently have been found to be involved in task switching (Konishi et al., 2001; Brass and von Cramon, 2004; Dosenbach et al., 2006). In sum, the occurrence of frontoparietal activity during perceptual transitions can plausibly be attributed, at least in part, to changes in sensory experience, attentional state, and task demand that are associated with the occurrence of those transitions.
Given this alternative account of frontal activation during transitions, how do we reconcile our findings with the earlier results implying that the major role of frontoparietal areas is to cause perceptual transitions? The next several paragraphs address this question.
The idea that frontoparietal areas play a causal role in perceptual transitions is based substantially on the observed enhanced fMRI activity in these areas for endogenous perceptual transitions compared with replayed transitions (Lumer et al., 1998; Sterzer and Kleinschmidt, 2007). It is to those findings that our present results speak most directly. These previous fMRI studies replayed all transitions as instantaneous, yet transition durations during bistable perception in the scanner were not recorded. It is possible, therefore, that differences in transition durations contributed to the signal differences between the two conditions. 
In general, prolonged transitions are common in bistable perception (Hollins and Hudnell, 1980; Anstis et al., 1985; Yantis and Nakama, 1998; Hol et al., 2003; Brascamp et al., 2006; Klink et al., 2010) and our psychophysical experiment demonstrated prolonged transitions for stimuli designed to be similar to stimuli used in these previous fMRI studies (see Results, above) (Table 1).
Another finding that may point to a causal role in perceptual transitions, particularly of the right IFC (rIFC), is that rIFC BOLD activity precedes activity in sensory areas by a longer time period during endogenous transitions than during replayed transitions (Sterzer and Kleinschmidt, 2007). This finding cannot readily be explained by the hypothesis that rIFC merely responds to perceptual transitions and it may indeed indicate a causal role of rIFC. Nevertheless, it is interesting to note that the relative timing difference in BOLD response found in that study, in which transitions were replayed as instantaneous events, amounted to ∼800 ms. In our psychophysical experiment, we observed transitions to last 1.2 ± 0.5 s for the stimulus we copied from that study, a duration of the same order as that timing difference. This suggests the alternative possibility that different brain areas respond at different moments during perceptual transitions, such as their onset or their offset. This speculative idea could be tested in future work that investigates the fine temporal structure of BOLD responses while also controlling transition durations.
Other relevant findings include the observed effects of frontal brain damage on bistable perception. Frontal lobe damage renders the alternative percepts of a bistable stimulus harder to identify (Ricci and Blundo, 1990; Meenan and Miller, 1994) (but see Valle-Inclán and Gallego, 2006) and impairs top-down volitional control over the perceptual sequence (Windmann et al., 2006). Turning to parietal cortex, the rate at which perception alternates during bistable perception is altered when transcranial magnetic stimulation is used to interfere with parietal function (this has not been replicated for frontal cortex) (Carmel et al., 2010; Kanai et al., 2010; Zaretskaya et al., 2010; de Graaf et al., 2011; Kanai et al., 2011). Moreover, alternation rate is correlated with anatomical characteristics of parietal cortex (but not frontal cortex) (Kanai et al., 2010). Although these findings are broadly consistent with the idea of a trigger of perceptual transitions located in frontoparietal areas, they also fit well with a modulatory role of these areas on the alternation cycle. This latter option would be consistent with psychophysical findings showing that directed attention modulates the frequency of perceptual alternations (Meng and Tong, 2004; van Ee et al., 2005; Chong et al., 2005; Paffen et al., 2006).
Together, this growing literature provides compelling evidence that frontal and parietal areas are involved in bistable perception. In this respect, bistable perception is similar to visual perception in general, which relies on an interaction between primary visual areas and other brain regions, including the frontal and parietal cortex (Crick and Koch, 1995; Driver and Vuilleumier, 2001; Corbetta and Shulman, 2002; Rees, 2007). It is possible that this interaction takes place along the lines of the hybrid view expressed by Sterzer and Kleinschmidt (2007) and then elaborated by Sterzer et al. (2009). According to this view, the initial cause of perceptual alternations may be a destabilization of the currently dominant perceptual representation, perhaps due to neural adaptation in sensory brain areas of the sort implied by many psychophysical studies (Blake et al., 2003; Lankheet, 2006; Brascamp et al., 2006; Pastukhov and Braun, 2007; van Ee, 2009; Alais et al., 2010). Following this destabilization, the idea continues, frontoparietal areas might respond by initiating a reorganization through feedback signals to those same sensory areas, resulting in the formation of a new dominant percept. In this scenario, perceptual alternations would involve a cascade of events, some of them causing frontoparietal areas to activate and others caused by frontoparietal areas themselves.
This work was supported by NIH Grants EY14437 and EY13358, the World Class University program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (R32-10142, to R.B.), a High Potential grant of Utrecht University and a Flemish Methusalem 08/02 grant (to R.v.E.), a National Health and Medical Research Council (Australia) CJ Martin Fellowship 457146 (to J.P.), and a Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO) Rubicon grant (to J.B.). T.K. was supported by a NWO VENI grant (451-09-016) and a Chaire d'Excellence grant to Patrick Cavanagh. We thank Jascha Swisher for helpful discussions during the design of the imaging experiments. We thank Philipp Sterzer, Frank Tong, Patrick Cavanagh, René Marois, Ryota Kanai, Erich Graf, Wendy Adams, Tobias Donner, and Jascha Swisher for helpful comments on earlier versions of the manuscript.
The authors declare no competing financial interests.
Correspondence should be addressed to Tomas Knapen, University of Amsterdam, Brain and Cognition, Room 611, Roeterstraat 15, 1018WB Amsterdam, The Netherlands.
This article is freely available online through the J Neurosci Open Choice option.