Abstract
In this functional magnetic resonance imaging study we tested whether the predictability of stimuli affects responses in primary visual cortex (V1). The results indicate that visual stimuli evoke smaller responses in V1 when their onset or motion direction can be predicted from the dynamics of surrounding illusory motion. We conclude from this finding that the human brain anticipates forthcoming sensory input, which allows predictable visual stimuli to be processed with less neural activation at early stages of cortical processing.
Introduction
Vision can be regarded as a continuous cascade of neural reactions to the light that enters our eyes. Several theoretical models, however, elaborate this view of vision by claiming that the brain is not merely reactive but also “proactive” or “predictive” (Mumford, 1992; Rao and Ballard, 1999; Erlhagen, 2003; Bar, 2007; Enns and Lleras, 2008; Bar, 2009; Friston and Kiebel, 2009; Grossberg, 2009). By predictive we refer to the idea that the brain generates predictions that estimate the visual input it will most likely receive given the contextual information from the recent past. In their theoretical model, Rao and Ballard (1999) have put forward the idea that such predictions play a central role in vision. They propose that visual cortices learn statistical regularities of the natural world and only signal the unpredictable components of their sensory input to higher visual areas. As a result, predictable stimuli are thought to require less neural activation to be conveyed from lower to higher visual cortices.
The aim of this functional magnetic resonance imaging (fMRI) study is to test whether predictability of stimuli reduces responses in the human visual cortex as proposed in the above model of predictive coding. If this model holds, then we expect predictability to reduce visual responses in primary visual cortex (V1), which is the earliest stage of visual processing in the human cerebral cortex. To test this hypothesis, we assessed whether visual stimuli induce smaller V1 responses when their onset or motion direction can be predicted from the trajectory of surrounding illusory motion. In addition to assessing blood oxygenation level-dependent (BOLD) responses in V1, we also measured responses in the human visual motion area hMT/V5+ because of its known involvement in the processing of dynamic visual stimuli (Zeki et al., 1991; Tootell et al., 1995).
Materials and Methods
Subject fMRI.
Twelve healthy subjects (six male, six female) with normal or corrected-to-normal vision participated in the first fMRI experiment, and five (four male, one female) participated in the second. All subjects gave their informed consent after being introduced to the experimental procedure in accordance with the Declaration of Helsinki.
Stimuli and task fMRI experiment 1.
Stimuli were generated using Presentation software (version 10.3; Neurobehavioral Systems) and presented at a 60 Hz refresh rate using a projector (Sanyo Pro xtraX PLC-XP41 multiverse projector) with a zoom lens projecting from an adjacent room through a wave guide. Subjects viewed the stimuli through a tilted adjustable mirror (inside the head coil) on a screen that was attached to the back of the head coil. The viewable screen size subtended 33.7 × 26.6° of visual angle.
Subjects were instructed to fixate on a central fixation cross throughout the entire experiment, during which bars were consecutively presented 9.0° above and below the fixation cross and with a horizontal offset of 9.2° to the right (Fig. 1a). The screen had a gray background color (luminance: 28.3 cd/m2), and the bars (height: 1.7°; width: 4.7°; luminance: 139.0 cd/m2) induced the impression of upward and downward long-range apparent motion (Exner, 1875; Wertheimer, 1912; Kolers, 1963; Beck et al., 1977; Ekroll et al., 2008) with a full-cycle frequency of 1.43 Hz. Each bar was presented for 200 ms with an interstimulus interval (ISI) between bar presentations of 150 ms. Critically, for predictable and unpredictable trials we briefly (16.7 ms) presented a test bar 5.0° above the lower bar during each of the ISIs in which upward apparent motion was perceived. During predictable trials, we presented this stimulus during the third frame after the offset of the lower bar stimulus, which corresponds to a presentation delay of 41.7 ms (assuming that the actual presentation occurred halfway through the third frame). This timing was chosen because it is exactly 2.5/9 of the ISI, which corresponds to the ratio of the distance between the lower and the test stimulus to the total length of the apparent-motion path (5.0/18°). Therefore, this stimulus is positioned and timed exactly on the motion trajectory of linear apparent motion between the lower and upper bar stimuli. For unpredictable trials, the test bar was presented at the same position during the seventh frame after the offset of the lower bar, corresponding to a 108 ms delay. This caused the test bar to appear at a time at which linear apparent motion had already passed the position of the test bar in unpredictable trials (Fig. 1b shows a schematic overview). The third trial type contained the apparent-motion stimuli but no test bar and served to assess a baseline signal.
Note that this baseline included all stimulus components except the test bar. Therefore, deconvolved BOLD responses for predictable and unpredictable trials can be attributed only to the presentation of the test bar and not to the presentation of the upper and lower bars, because responses to the latter are also contained in the baseline and are therefore subtracted out. All three types of trials lasted for 7 s, during which 10 apparent-motion cycles were presented. Subjects were presented with 81 trials of each type distributed over three runs of fMRI measurements. We employed a rapid event-related design and, to ensure a correct deconvolution of the BOLD responses, pseudorandomized the trial sequence within each run such that trial history was two-back balanced (Alink et al., 2008).
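The timing of the test bar described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the presentation script used in the study; the function and variable names are illustrative, and it assumes, as stated in the text, that presentation occurs halfway through the given frame at a 60 Hz refresh rate.

```python
# Sketch of the test-bar timing arithmetic (names are illustrative assumptions).
REFRESH_HZ = 60.0
FRAME_MS = 1000.0 / REFRESH_HZ   # duration of one frame, about 16.7 ms
BAR_MS = 200.0                   # duration of each apparent-motion bar
ISI_MS = 150.0                   # interstimulus interval (9 frames)


def mid_frame_delay_ms(frame_index):
    """Delay from lower-bar offset to the midpoint of a 1-based frame."""
    return (frame_index - 0.5) * FRAME_MS


predictable_delay = mid_frame_delay_ms(3)     # third frame after offset
unpredictable_delay = mid_frame_delay_ms(7)   # seventh frame after offset

# Fraction of the ISI elapsed versus fraction of the motion path covered.
time_fraction = predictable_delay / ISI_MS    # 2.5 / 9
path_fraction = 5.0 / 18.0                    # test-bar distance / path length

# One full cycle is an upward plus a downward sweep: 2 * (bar + ISI).
cycle_hz = 1000.0 / (2 * (BAR_MS + ISI_MS))

print(round(predictable_delay, 1), round(unpredictable_delay, 1))
print(round(time_fraction, 4), round(path_fraction, 4), round(cycle_hz, 2))
```

The two fractions coincide, which is the sense in which the predictable test bar lies on the trajectory of linear apparent motion.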
To localize the cortical representation of the test bar stimulus in V1 and hMT/V5+, we presented inverting black-and-white checkerboards (spatial frequency: 1.2 cycles/degree; inversion frequency: 16 Hz; luminance white: 139.0 cd/m2; luminance black: 2.1 cd/m2) with the same location and extent as the test bar and the lower bar on a black background (2.1 cd/m2). These stimuli were presented in blocks of 16 s with 16 s fixation intervals that served as baseline. Throughout the entire run subjects fixated on a central white (139.0 cd/m2) fixation cross identical to the one in the main experiment.
Stimuli and task fMRI experiment 2.
During the second experiment we presented the same apparent-motion stimuli as during experiment 1. However, these were presented using a magnetic resonance-compatible goggle system with two organic light-emitting diode displays (MR Vision 2000; Resonance Technology) that resulted in an 11% decrease in stimulus width and a 15% decrease in stimulus height due to the different screen size of the goggle system (30.0 × 22.5°). The luminance of the gray background on this screen was 23.8 cd/m2, and the apparent-motion stimuli and fixation cross had a luminance of 44.0 cd/m2. During the entire experiment, subjects were instructed to maintain fixation on the fixation cross.
During the 150 ms ISI between the apparent-motion stimuli, we presented 150 dots (size = 0.1°; luminance = 44.0 cd/m2) randomly placed in an area with a width of 4.2° and a height of 8.25° centered on the apparent-motion path (Fig. 1c). During these 150 ms ISIs, the dots moved with a velocity of 9° per second in four different directions: parallel to apparent-motion direction or 30, 60, or 90° anti-clockwise from the apparent-motion direction. These random-dot motion configurations are referred to as 0, 30, 60, and 90° angle offset, respectively. Random-dot motion was presented during both upward and downward apparent motion in opposite directions such that the 0, 30, and 60° angle offset stimuli moved upwards during upward apparent motion and vice versa. The 90° angle offset condition contained no vertical motion component and moved leftwards during upward apparent motion and rightwards during downward apparent motion. Moving dots exiting the motion area reappeared at the opposite side of the motion area. As in experiment 1, we used apparent motion without stimulation on the apparent-motion trace to assess the baseline signal, and we presented the stimulus conditions in trials containing 10 apparent-motion cycles, each lasting 7 s. In total, each angle offset trial type was presented 40 times and baseline trials were presented 120 times to all subjects, divided over four runs in a randomized order. Again, as in experiment 1 we used inverting checkerboards to localize the cortical representation in V1 of the area in which the random dots were presented.
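The geometry of the random-dot motion can be illustrated as follows. This is a hedged sketch rather than the stimulus code used in the experiment: the coordinate convention (x to the right, y upward, origin at the center of the dot area) and the function names are assumptions made for illustration, while the speed, area size, and angle offsets are taken from the description above.

```python
import math

SPEED_DEG_PER_S = 9.0            # dot speed from the text
AREA_W, AREA_H = 4.2, 8.25       # dot area in degrees of visual angle


def dot_velocity(angle_offset_deg, upward):
    """Velocity (vx, vy) for an angle offset measured anti-clockwise
    from the apparent-motion direction (assumed: x right, y up)."""
    theta = math.radians(angle_offset_deg)
    # Rotating the upward unit vector (0, 1) anti-clockwise by theta.
    vx, vy = -math.sin(theta), math.cos(theta)
    if not upward:
        # Downward apparent motion uses the opposite direction.
        vx, vy = -vx, -vy
    return SPEED_DEG_PER_S * vx, SPEED_DEG_PER_S * vy


def step(x, y, vx, vy, dt):
    """Advance one dot by dt seconds; dots leaving the area reappear
    at the opposite side (wrap-around)."""
    x = (x + vx * dt + AREA_W / 2) % AREA_W - AREA_W / 2
    y = (y + vy * dt + AREA_H / 2) % AREA_H - AREA_H / 2
    return x, y


for offset in (0, 30, 60, 90):
    print(offset, dot_velocity(offset, upward=True))
```

Under this convention the 90° offset has no vertical component and points leftwards during upward apparent motion and rightwards during downward apparent motion, matching the description in the text.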
fMRI procedure experiment 1.
Functional and anatomical MRI data were acquired with a 3T-MRI system (Siemens Trio) using a standard computed tomography head coil. During the presentation of the apparent-motion stimuli, we obtained three runs of 588 volumes containing 17 slices covering the occipital lobe as well as inferior parietal, inferior frontal, and superior temporal regions for each subject using an echo planar imaging (EPI) sequence [repetition time (TR), 1000 ms; echo time (TE), 30 ms; flip angle, 62°; voxel size, 3.4 × 3.4 × 3.0 mm; field of view (FOV), 220 mm; gap thickness, 0.3 mm]. Checkerboard stimuli were presented in a separate run during which 638 volumes were acquired using identical scanning parameters. All EPI images were corrected for spatial distortions using a point spread function sequence (Zaitsev et al., 2004). For each subject we also obtained a high-resolution T1-weighted anatomical image using a Siemens MPRAGE sequence (1 × 1 × 1 mm). For six of the subjects we also performed standard polar-angle retinotopic mapping using the same parameters employed routinely in our laboratory (Weigelt et al., 2007; Muckli et al., 2009). Furthermore, we measured eye movements during the fMRI experiment for 11 subjects by using an infrared camera system placed outside the scanner room that measured the position of the right eye's pupil and cornea reflex at a rate of 60 Hz through a mirror system (Applied Science Laboratories).
fMRI procedure experiment 2.
Functional and anatomical MRI data were acquired with a 3T-MRI system (Siemens Allegra) using a four-channel head coil. During the presentation of the apparent-motion stimuli, we obtained four runs of 700 volumes containing 18 slices covering the occipital lobe as well as inferior parietal, inferior frontal, and superior temporal regions for each subject using an EPI sequence (TR, 1000 ms; TE, 30 ms; flip angle, 77°; voxel size, 3.3 × 3.3 × 3.5 mm; FOV, 210 mm; gap thickness, 0.35 mm). Checkerboard stimuli were presented in a separate run during which 484 volumes were acquired using identical scanning parameters. All EPI images were corrected for spatial distortions using a point spread function (Zaitsev et al., 2004).
Analysis of fMRI experiment 1.
Functional as well as anatomical MRI data were analyzed using the Brainvoyager QX software package (Brain Innovation). The first four volumes of the functional runs were discarded to preclude T1 saturation effects. After preprocessing (motion correction, linear trend removal, temporal high-pass filtering at 0.01 Hz, and slice-scan-time correction), functional data for all subjects were aligned with the individual high-resolution anatomical MPRAGE image and transformed into Talairach space (Talairach and Tournoux, 1988). After manual correction for inhomogeneities, we created an inflated cortex reconstruction for all 12 subjects. For the six subjects for whom a polar angle map was acquired, we defined the V1–V2 borders on this cortex reconstruction as shown in supplemental Figure 1, available at www.jneurosci.org as supplemental material. Regions of interest (ROIs) for the cortical representation of the location of the test bar stimulus were defined individually in V1 as well as in hMT/V5+. The ROI in V1 consisted of the 500 mm3 of cortex within the calcarine sulcus that responded most strongly when a checkerboard was presented at the location of the test bar but showed no response when a checkerboard was presented at the lower bar location. The t threshold that defined the minimum t value of this area was different for each subject (see Results). This area was clearly within the borders of V1 for all six subjects for whom we mapped the V1–V2 border. This finding is in line with studies on the retinotopic organization of human primary visual cortex (Vanni et al., 2005) showing that stimulation both on and close to the horizontal meridian elicits activation within close proximity of the calcarine sulcus, which, when the eccentricity of the stimulus is sufficient, can easily be separated from parallel activation in V2.
Because the six mapped subjects' data quality was sufficiently high to reproduce these stereotypical findings, we assumed that the ROIs defined in the calcarine sulcus for the other six subjects without defined V1–V2 borders should also be sound. For 10 of the 12 subjects we were also able to select 200 mm3 of cortex within V1 that was responsive to checkerboards presented at the lower bar location but not to checkerboards presented at the target location (for three exemplary subjects see supplemental Fig. 2a, available at www.jneurosci.org as supplemental material).
It has been shown that checkerboards with frequencies close to the one used here elicit BOLD responses in hMT/V5+ (Tootell et al., 1995). This was also apparent in our data, which allowed us to individually define ROIs consisting of 500 mm3 of cortex in hMT/V5+ that was activated by the checkerboard presented at the test bar location. Defining these ROIs allowed us to evaluate whether predictability in the context of apparent motion affected BOLD responses within the cortical representation of the test and lower bar in V1 and the representation of the test bar in hMT/V5+. This was tested on a group level by pooling the individually defined data from the main experiment originating from the ROIs in V1 and hMT/V5+ of all 12 subjects for the test bar ROIs and 10 subjects for the lower-bar ROIs. From the pooled data, we computed a general linear model (GLM) for all ROIs using a deconvolution design (Glover, 1999) and tested whether the β values for time points 4–12, which correspond to the peak of the BOLD response (4–12 s poststimulus), were significantly different for predictable compared with unpredictable trials. p values were corrected for multiple comparisons using Bonferroni's correction for the number of ROIs in which we compared BOLD responses between conditions. For ROIs that showed an effect, we assessed whether this effect was consistent across subjects by computing the direction of the difference for each subject individually and using a sign test to assess whether more subjects showed an effect in one direction than would be expected by chance.
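The structure of a deconvolution design of the kind referred to above can be sketched as a set of one predictor per poststimulus time point. The code below is an illustrative assumption and not the Brainvoyager QX implementation: the function names, the number of modeled lags, and the example onsets are hypothetical, and it assumes that scans and seconds coincide (TR = 1000 ms) and that time point 0 is the scan at trial onset.

```python
def fir_design(onsets, n_scans, n_lags):
    """Deconvolution (finite impulse response) design for one condition.

    Rows are scans, columns are poststimulus lags; entry [t][lag] is 1
    when scan t lies `lag` scans after one of the trial onsets.
    """
    design = [[0] * n_lags for _ in range(n_scans)]
    for onset in onsets:
        for lag in range(n_lags):
            t = onset + lag
            if t < n_scans:
                design[t][lag] = 1
    return design


def mean_peak_beta(betas, first=4, last=12):
    """Average the estimated betas over time points first..last (inclusive)."""
    window = betas[first:last + 1]
    return sum(window) / len(window)


# Hypothetical example: two trials starting at scans 0 and 7, 20 lags modeled.
design = fir_design(onsets=[0, 7], n_scans=30, n_lags=20)
print(design[11])  # scan 11 is lag 11 of the first trial and lag 4 of the second
```

Because each lag gets its own predictor, overlapping responses from successive 7 s trials can be separated by the least-squares fit, provided the trial sequence is suitably balanced.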
In addition to the ROI analysis, we also performed a group analysis over the entire brain volume to see whether we could find regions other than V1 and hMT/V5+ in which visual responses are affected by the predictability of the test stimulus. To this end we smoothed the functional data of each subject with a Gaussian kernel (8 mm full-width at half-maximum) and computed a GLM over the smoothed data across subjects. The effect of predictability was assessed by contrasting β values across conditions for the time points 4–12 in conjunction with contrasts that tested whether both types of stimuli induced a significant signal increase compared with baseline.
Analysis of fMRI experiment 2.
The responses in V1 and hMT/V5+ to the different types of random-dot motion were analyzed using the same ROI approach as that employed in experiment 1. For each subject we defined the ROI for V1 as a volume of 500 mm3 close to the calcarine sulcus that responded to a checkerboard stimulus presented at the location of the random-dot motion area. For hMT/V5+ we defined the ROI as a volume of 500 mm3 close to the posterior part of the inferior temporal sulcus that responded to all moving-dot configurations. Based on the group ROI data, we calculated an average BOLD response for V1 and hMT/V5+ for each of the angle offsets. To test our hypothesis that predictability of the random-dot motion direction reduces visual responses, we tested whether the most predictable motion type (angle offset 0°) induced a significantly lower BOLD response than the least predictable motion type (angle offset 90°). To this end, we tested whether β values for time points 4–12 were significantly lower for the 0° angle offset condition. Furthermore, we tested whether angle offset linearly increases visual responses in V1 and hMT/V5+ by assessing the Pearson correlation between the mean β value from time points 4–12 and the angle offset.
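The correlation between angle offset and mean β value mentioned above uses the standard Pearson coefficient. The sketch below shows only the formula; the function name is illustrative, and the β values in the example are placeholders chosen to demonstrate the call, not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


angle_offsets = [0, 30, 60, 90]          # degrees, from the text
placeholder_betas = [0.1, 0.2, 0.2, 0.4] # placeholder values, NOT measured data
print(pearson_r(angle_offsets, placeholder_betas))
```

A positive coefficient would indicate that responses grow as the motion direction becomes less predictable; whether it is significant depends on the number of observations entering the test, which the text does not specify.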
Analysis of eye movements.
For 11 subjects we calculated the mean and standard deviation of the horizontal and vertical position of fixation for the predictable and the unpredictable conditions over all data points that were outside a ±200 ms interval of eye blinks (time points at which the pupil diameter was zero). We tested whether there were differences in mean and variance across conditions using a repeated-measures ANOVA over subjects. Furthermore, we created a density plot of eye position for both conditions using all eye-tracking data across all subjects (supplemental Fig. 3, available at www.jneurosci.org as supplemental material).
Results
fMRI experiment 1
We defined cortical ROIs within V1 and hMT/V5+ representing the position and extent of the predictable and unpredictable stimuli for all 12 subjects using individualized t thresholds [mean (SD) for t thresholds in V1 = 5.15 (3.12) and in hMT/V5+ = 3.39 (1.99); mean (SD) of Talairach coordinates for V1: x = −6.7 (2.9), y = −85.0 (4.2), z = −2.1 (2.9), and for hMT/V5+: x = −41.0 (5.2), y = −75.0 (5.2), z = 1.6 (5.3); for details see supplemental Fig. 1, available at www.jneurosci.org as supplemental material]. From the data within these ROIs pooled across all 12 subjects, we computed deconvolved BOLD responses for the predictable and unpredictable stimuli in V1 and hMT/V5+ (Fig. 2a,c). Within these ROIs, we analyzed BOLD responses to stimuli for which the onset could or could not be predicted from the trajectory of apparent motion. These stimuli are referred to as predictable or unpredictable stimuli, respectively, and were identical in all aspects except for the onset relative to the apparent-motion trajectory (Fig. 1). We found that predictable stimuli gave rise to a significantly lower BOLD response in V1 than unpredictable stimuli (p < 0.0066, Bonferroni corrected for the number of ROIs), while there appeared to be no effect of predictability within area hMT/V5+ (p > 0.05, Bonferroni corrected). Individual responses in V1 turned out to be reduced for predictable stimuli for 10 of 12 subjects (sign test: p < 0.05), while only half of the subjects showed this effect in hMT/V5+, as expected by chance (sign test: p > 0.05). Thus, our results indicate that the predictability of the onset of a stimulus presented on the apparent-motion path reduces responses in V1 while not affecting hMT/V5+ responses.
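The sign-test results reported above can be reproduced from the binomial distribution. The following is a minimal sketch assuming an exact two-sided test without ties; the text does not state whether the test was one- or two-sided, so that choice, like the function name, is an assumption.

```python
from math import comb


def sign_test_two_sided(k, n):
    """Exact two-sided sign test: probability, under chance (p = 0.5),
    of a split at least as extreme as k of n subjects in one direction."""
    tail = min(k, n - k)
    p_one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * p_one_sided)


print(sign_test_two_sided(10, 12))  # V1: 10 of 12 subjects
print(sign_test_two_sided(6, 12))   # hMT/V5+: 6 of 12 subjects
```

For 10 of 12 subjects this gives p ≈ 0.039, consistent with the reported p < 0.05, whereas an even 6-to-6 split gives p = 1.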
To test whether the effect of predictability in V1 was retinotopically specific, we also analyzed BOLD responses of ROIs in V1 of 10 subjects that represent the position and extent of the lower apparent motion-inducing stimulus [mean (SD) for t thresholds: 5.06 (2.69); mean (SD) of Talairach coordinates: x = −2 (3.0), y = −83.0 (3.6), z = −1.2 (3.4)]. In this region, we observed no differences between BOLD responses to predictable and unpredictable stimuli. Thus, the effect that we observe in V1 for predictability is retinotopically specific to the V1 representation of the test stimulus presented on the apparent-motion path (supplemental Fig. 2, available at www.jneurosci.org as supplemental material).
We also performed a group analysis over the entire brain volume measured in experiment 1 to assess whether other regions besides V1 show an effect of stimulus predictability. This analysis did not identify any region that was significantly affected by stimulus predictability (p > 0.05, corrected using false discovery rate). Supplemental Figure 5, available at www.jneurosci.org as supplemental material, shows a statistical map (p < 0.05, uncorrected) for this group analysis and demonstrates that the only activation clusters showing an effect of stimulus predictability (albeit not significant after the correction for multiple comparisons) are those inside or near the individual ROI volumes for V1.
To ensure that our effects did not result from differential fixation performance across conditions, we measured eye movements of our subjects inside the scanner. Across subjects, the mean horizontal and vertical positions of fixation differed by <0.1° of visual angle between predictable and unpredictable trials (p > 0.05, repeated-measures ANOVA). Also, the SDs for both dimensions did not differ across trial types (p > 0.05, repeated-measures ANOVA). Density plots of eye position show no gross differences in the distribution of fixation accuracy in space across conditions (supplemental Fig. 3, available at www.jneurosci.org as supplemental material).
fMRI experiment 2
As in experiment 1, we analyzed BOLD responses in individual ROIs for V1 and hMT/V5+ [mean (SD) for t thresholds in V1 = 6.2 (2.0), and in hMT/V5+ = 9.0 (3.2); mean (SD) of Talairach coordinates for V1: x = −2.0 (5.4), y = −81.4 (4.4), z = 1.4 (4.4), and for hMT/V5+: x = −41.6 (1.9), y = −68.8 (5.2), z = 7.2 (3.3)]. To test whether predictable motion in the context of apparent motion induces lower visual responses, we assessed in V1 and hMT/V5+ whether responses to the most predictable motion-angle offset of 0° were lower compared with responses to the least predictable angle offset of 90°. Indeed, both these areas exhibited a lower response when the random dots moved parallel to the apparent-motion direction (0°) compared with responses to orthogonal motion (90°) (V1: p < 0.0005; hMT/V5+: p < 0.0005, Bonferroni corrected for the number of ROIs). For both areas, we also tested whether there was a positive correlation between the angle offset and the visual response amplitudes. This turned out to be the case for both areas, although the correlation in V1 did not reach significance (V1: p = 0.08; hMT/V5+: p < 0.02) (Fig. 3 shows more details).
Psychophysical control experiment
Previously we have shown that low-contrast stimuli that are predictable in the context of apparent motion are more readily detected (Schwiedrzik et al., 2007). To test whether this is also the case for the high-contrast stimuli used in experiment 1, we performed a control experiment which contained both high- and low-contrast target stimuli. These stimuli were presented during upward as well as downward apparent motion at two different positions along the apparent-motion path. The results of this experiment replicated our previous findings (Schwiedrzik et al., 2007). Thus, stimuli that are predictable in the context of apparent motion were detected more often than unpredictable stimuli (mean detection rate predictable = 38%, mean detection rate unpredictable = 32%, p < 0.03; repeated-measures ANOVA, two-sided test). Neither stimulus contrast, apparent-motion direction, nor target position was found to interact with this effect (for details on the experimental procedure see the supplemental methods, available at www.jneurosci.org as supplemental material). From the current experiment, however, we cannot tell whether the difference in detection rates is due to a difference in d′ or due to a response bias because our paradigm did not allow us to assess the correct rejections or the false-alarm rate. Our previous experiment (Schwiedrzik et al., 2007), however, indicated that the effect of predictability on detection rates was not due to a criterion shift. As the stimuli employed in the current psychophysical experiment are almost identical to those employed in our previous experiment, it is unlikely that the elevated detection rates reported here are due to a criterion shift.
The mean reaction time for predictable stimuli was 513 ms and for unpredictable stimuli 521 ms. This small difference in reaction time between these stimulus categories was, however, not significant (p > 0.05). Reaction times were also not affected by stimulus contrast, apparent-motion direction or target position.
Discussion
In this fMRI study we investigated whether predictable stimuli evoke smaller responses in V1 as implied by predictive-coding models (Mumford, 1992; Rao and Ballard, 1999). To this end, we measured BOLD responses in V1 to stimuli whose onset or motion direction could either be predicted or not predicted from their spatiotemporal context. Hence, we tested whether activation in a mapped region in V1 was modulated by illusory motion induced by stimuli presented well outside the classical receptive field of this V1 region. Furthermore, we assessed whether stimulus predictability affected activation levels in the human visual motion area hMT/V5+.
The results of both experiments are in line with our hypothesis that stimulus predictability reduces activation levels in V1. The outcome of experiment 1 indicates that stimuli with a predictable onset give rise to lower V1 responses than identical stimuli presented with a less predictable onset. The second experiment shows that responses in V1 and hMT/V5+ are lowest when the direction of random-dot motion is predicted by the direction of apparent motion and that visual responses in these areas increase as the direction is made less predictable.
Our findings are in line with several other studies that have observed lower V1 responses for stimuli that fit their visual context. V1 has been shown to respond less to coherent than to incoherent motion (McKeefry et al., 1997; Harrison et al., 2007; Bartels et al., 2008) and less to grouped than to randomly arranged objects (Murray et al., 2002). Furthermore, face-selective areas in ventral visual cortex have been shown to respond less when a face stimulus is repeated in a continuous trajectory (Yi et al., 2008), and responses of neurons in the superior temporal sulcus of the monkey brain were shown to be suppressed and to occur at shorter latencies when stimulation consists of predictable sequences of natural images (Perrett et al., 2009). However, our study is the first to show that subtle changes in the spatiotemporal predictability of a stimulus affect stimulus processing in V1. Hence, in experiment 1 we show that V1 processes stimuli with less activation when their onset is predictable, even though luminance, size, position, and duration of stimuli were kept constant. Experiment 1 also shows that this effect of predictability is constrained to the retinotopic representation of the test stimulus. Furthermore, we demonstrate in experiment 2 that responses in V1 decrease when the predictability of visual stimuli is parametrically increased.
Another important implication of experiment 1 is that lower responses in V1 can co-occur with higher detection rates. Although we did not measure behavioral responses inside the scanner during experiment 1, the results of our previous study (Schwiedrzik et al., 2007), taken together with the results of our psychophysical control experiment, imply that the high-contrast test stimuli used during this experiment should have been more detectable when they were predictable in the context of apparent motion. Thus, the present study implies that a predictable stimulus that is more detectable can induce a smaller BOLD response in V1 than an unpredictable and less detectable stimulus.
In experiment 1, we observed that the predictability of stimulus onset reduced V1 responses but that no similar effect was present in hMT/V5+. In experiment 2, however, both of these areas were found to exhibit reduced responses when random-dot motion was more predictable in the context of apparent motion. One could conclude from these results that hMT/V5+ is affected by the predictability of motion direction but not by the predictability of stimulus onset. However, it is also possible that hMT/V5+ is sensitive to both of these features, but we did not have a large enough signal-to-noise ratio to demonstrate this due to the low amplitude of this region's responses to static stimuli.
Given the results presented here, what can be said about the mechanisms that allow V1 to process predictable stimuli with less activation? According to the model of Rao and Ballard (1999), this would require feedback from higher-level visual areas specifying which stimulus input is likely to arrive in V1 given the current spatiotemporal context. Feedback from higher-level visual areas to V1 seems a likely explanation for the effects of stimulus predictability reported here as these areas have larger receptive fields than V1, allowing them to determine the trajectory of long-range apparent motion (Angelucci and Bullier, 2003; Angelucci and Bressloff, 2006; Ichida et al., 2007). This fact, taken together with the observation that during long-range apparent motion hMT/V5+ sends feedback signals to V1 (Muckli et al., 2005; Sterzer et al., 2006; Ahmed et al., 2008; Wibral et al., 2008), can be considered a strong indication that activation in hMT/V5+ drives the predictability effect in V1. However, several studies have suggested that local processing of feedforward signals in V1 allows for more sophisticated neural computations than one would expect from classical receptive field models (Seriès et al., 2002, 2003; Masland and Martin, 2007). Due to the low temporal resolution of fMRI, we could not assess whether activation in hMT/V5+ precedes and drives the predictability effects in V1. Therefore, it still remains to be determined whether reduced responses in V1 to predictable stimuli result from feedback, local processing in V1 or, what is likely to be the case, an interaction between feedback and local processing in V1 (Erlhagen, 2003).
Yi et al. (2008) observed that ventral visual cortex responds less to continuously than to discontinuously moving objects and attributed this effect to subjects perceiving continuously moving objects more as a single entity or gestalt. One could argue that the results presented here are due to a similar mechanism by claiming that predictable stimulus ensembles had a greater integrity as a gestalt. Such an interpretation does not, however, stand in opposition to the predictive-coding model of Rao and Ballard (1999). They propose that predictions are based on statistical regularities of the natural world that can be argued to be the basis of gestalt principles (Brunswik and Kamiya, 1953; Elder and Goldberg, 2002). It is also worth mentioning in this context that the extraclassical receptive-field effects explained in the model of Rao and Ballard (1999) all relate to reduced neural responses to stimuli that form a gestalt with their spatial surround based on collinearity.
Another explanation for a higher BOLD response in V1 to unpredictable stimuli could be that these types of stimuli induce greater pop-out. It could be that the unpredictable stimuli stood out more than the predictable stimuli due to their higher incompatibility with the surrounding apparent-motion stimuli. Such an attentional explanation would be in line with the finding of greater neural responses in V1 to stimuli that induce a stronger pop-out effect in macaques (Smith et al., 2007). However, as we have shown previously (Schwiedrzik et al., 2007) and replicated in the psychophysical control experiment, detection rates are lower for unpredictable flashes, which speaks against this attentional interpretation. Hence, if the unpredictable flash pops out more than the predictable flash, then it should also be detected more readily (Treisman, 1982). Furthermore, if the higher V1 response in experiment 1 was driven by attention, then, based on previous studies (Treue and Maunsell, 1996; Beauchamp et al., 1997; Büchel et al., 1998), one would expect that such motion-related attentional modulation would be even stronger in hMT/V5+, which is not compatible with our findings.
To summarize, in this study we show that the predictability of visual stimuli reduces neural responses in V1 and hMT/V5+. This finding provides strong empirical evidence for the idea that the visual cortex actively anticipates its visual input and that such anticipation allows predictable stimuli to be processed with less neural activation at the earliest cortical relay for visual processing. Furthermore, our results imply that predictable stimuli can be detected more readily than unpredictable stimuli, although unpredictable stimuli evoke greater V1 responses.
Footnotes
- This work was supported by the Biotechnology and Biological Sciences Research Council (United Kingdom; BB/G005044/1) as well as by the Federal Ministry of Education and Research (Germany; BMBF 01 GO 0508, DFG MU 2358/1-2). We thank Felix Euler, Sandra Anti, Fraser W. Smith, Gerrit Maus, Lucia Melloni, Brigitte Rockstroh, Laura Wilkie, Petra Vetter, Lucy S Petro, and Florian Beißner.
- Correspondence should be addressed to Arjen Alink, Department of Neurophysiology, Max Planck Institute for Brain Research, Deutschordenstraße 46, D-60528 Frankfurt am Main, Germany. alink{at}mpih-frankfurt.mpg.de