Abstract
Hierarchical predictive coding networks are a general model of sensory processing in the brain. Under neural delays, these networks have been suggested to naturally generate oscillatory activity in approximately the α frequency range (∼8-12 Hz). This suggests that α oscillations, a prominent feature of EEG recordings, may be a spectral “fingerprint” of predictive sensory processing. Here, we probed this possibility by investigating whether oscillations over the visual cortex predictively encode visual information. Specifically, we examined whether their power carries information about the position of a moving stimulus, in a temporally predictive fashion. In two experiments (N = 32, 18 female; N = 34, 17 female), participants viewed an apparent-motion stimulus moving along a circular path while EEG was recorded. To investigate the encoding of stimulus-position information, we developed a method of deriving probabilistic spatial maps from oscillatory power estimates. With this method, we demonstrate that it is possible to reconstruct the trajectory of a moving stimulus from α/low-β oscillations, tracking its position even across unexpected motion reversals. We also show that future position representations are activated in the absence of direct visual input, demonstrating that temporally predictive mechanisms manifest in α/β band oscillations. In a second experiment, we replicate these findings and show that the encoding of information in this range is not driven by visual entrainment. By demonstrating that occipital α/β oscillations carry stimulus-related information, in a temporally predictive fashion, we provide empirical evidence of these rhythms as a spectral “fingerprint” of hierarchical predictive processing in the human visual system.
SIGNIFICANCE STATEMENT “Hierarchical predictive coding” is a general model of sensory information processing in the brain. When in silico predictive coding models are constrained by neural transmission delays, their activity naturally oscillates in roughly the α range (∼8-12 Hz). Using time-resolved EEG decoding, we show that neural rhythms in this approximate range (α/low-β) over the human visual cortex predictively encode the position of a moving stimulus. From the amplitude of these oscillations, we are able to reconstruct the stimulus' trajectory, revealing signatures of temporally predictive processing. This provides direct neural evidence linking occipital α/β rhythms to predictive visual processing, supporting the emerging view of such oscillations as a potential spectral “fingerprint” of hierarchical predictive processing in the human visual system.
- alpha oscillations
- low-beta oscillations
- motion processing
- neural transmission delays
- power decoding
- predictive coding
Introduction
“Predictive coding” is a general model of the hierarchical inference process underlying visual processing (Rao and Ballard, 1999). The functional architecture of the visual system implied by predictive coding is that of a hierarchical network of interconnected neural populations. The higher levels of this network attempt to predict the activity of lower levels, with the residuals of these predictions (prediction errors) being passed back up the hierarchy.
In the predictive coding literature, the fact that neural signaling takes time has often been overlooked (but see Friston, 2008; Hogendoorn and Burkitt, 2019). Consideration of this fact places important constraints on predictive coding models, in that predictions and residuals can never be transmitted instantaneously, but must rather pass between levels with some delay. Recent theoretical work has suggested that, when biologically plausible signaling delays are built into hierarchical predictive coding networks, the recursive network dynamics naturally generate oscillatory activity in approximately the α frequency range (∼8-12 Hz, with the precise frequency depending on the signaling delay and neural time constant) (Alamia and VanRullen, 2019). This is important because it suggests that oscillations in this general frequency range may (in some cases) be a signature of predictive sensory processing, arising from rhythmic “message passing” between hierarchically organized neural populations. If this is true, one might expect features of these rhythms, such as their power (squared amplitude), to carry information about the underlying stimulus being processed. However, this has yet to be directly tested. The primary aim of this study was therefore to examine whether the power of α oscillations over the occipital cortex carries stimulus-related information.
One complication that arises when incorporating neural delays into a hierarchical predictive coding framework is that, for time-varying input, backward predictions will always conflict with sensory input if neural delays are not accounted for. To effectively minimize prediction error, information processing must not only be hierarchically predictive, but also temporally predictive. That is, extrapolation mechanisms are needed that adjust forward and backward signals, correcting for the lag incurred during signal transmission (Hogendoorn and Burkitt, 2019). Consequently, if α oscillations are a signature of predictive coding, the information they carry should display temporally predictive/anticipatory qualities. When prior expectations about the stimulus can be generated, these rhythms should carry information about expected input, even in the absence of feedforward signals. While there is mounting evidence that neural activity patterns during visual processing do carry predictive information (e.g., Kok et al., 2014, 2017; Blom et al., 2020; Liu et al., 2021), the spectral locus of such information has typically not been investigated.
In the present study, we examined whether and how information about the position of a predictably moving stimulus manifests in oscillations over the occipital cortex. In two experiments (N = 32, 34), participants viewed an apparent motion stimulus (i.e., a series of spatially and temporally separated flashes that generate the percept of coherent motion) traveling along a circular path while EEG was recorded. In these sequences, the stimulus' trajectory was predictable, meaning its future position could be anticipated, although the end of each sequence was unexpected. Importantly, in a previously published analysis of the dataset from Experiment 1, we demonstrated that predictions about the upcoming stimulus position were evident in the EEG signal (Blom et al., 2020). Here, we use a complementary analysis strategy to investigate whether predictive representations manifest in specific oscillatory frequency bands. To do so, we develop a method for constructing probabilistic spatial maps from oscillatory power estimates. With this method, we demonstrate that the location of the stimulus can be decoded from occipital oscillations in the α/low-β range (peak information at ∼12 Hz). We also observe anticipatory activation of neighboring unstimulated position representations at the end of motion sequences, suggesting that the processes underlying predictive spatial preactivation manifest in α/β band oscillations. In a second experiment, we replicate and extend these findings, ruling out the possibility that the encoding of information in this range is driven by visual entrainment.
Materials and Methods
Experiment 1
This experiment includes data collected using two slightly different protocols. Separate analyses of this dataset have been reported previously (Blom et al., 2020, 2021).
Participants
Twelve observers (6 female, mean age 25 years) participated under the first protocol, and 20 observers (12 female, mean age 23 years) participated under the second protocol. All had normal or corrected-to-normal vision. Both protocols were approved by the human research ethics committee of the University of Melbourne (Ethics ID 1954628), Australia. All observers gave written informed consent before participating and were reimbursed AUD15 per hour.
Procedure
The stimulus was a black, truncated wedge presented on a uniform 50% gray background. The stimulus could appear in 1 of 8 equally spaced locations around a white central fixation point, at 22.5°, 67.5°, 112.5°, 157.5°, 202.5°, 247.5°, 292.5°, and 337.5° of polar angle from the vertical (Fig. 1). Inner and outer edges of the wedge were 6.3° and 7.7° of visual angle away from fixation, respectively. The wedge covered 11° of polar angle, with 1.3° of visual angle at the inner and 1.5° of visual angle at the outer edge. The stimulus was presented for 66 ms, with an interstimulus interval (ISI) of 33 ms and an intertrial interval of 400 ms between sequences. Stimuli were presented on an ASUS ROG PG258 monitor with a resolution of 1920 × 1080 running at 120 Hz. The monitor was controlled by an HP EliteDesk 800 G3 TWR PC running MATLAB R2017b with PsychToolbox 3.0.14. Participants viewed the stimuli from a headrest at a distance of 60 cm.
Task
Participants viewed an apparent motion stimulus moving along a circular trajectory, while EEG was recorded. After moving for between 3 and 44 flash repetitions (300 ms to 4.4 s), the stimulus either disappeared or reversed its direction (Fig. 1). Participants were tasked with making a button press whenever the stimulus was colored red instead of black. This occurred 32 times per block under Protocol 1 and 50 times per block under Protocol 2. The task was designed to keep participants engaged with the stimulus, and behavioral data were not analyzed. Under Protocol 1, trials containing a target were discarded and re-presented at the end of each block. Under Protocol 2, target trials were simply discarded.
Experimental design
Under Protocol 1, participants completed six blocks of sequences across three testing sessions. Under Protocol 2, participants completed two blocks across two testing sessions.
Under Protocol 1, each block contained the following types of trials, randomly interleaved as follows:
- Sequences with 1, 2, or 3 consecutive presentations starting at each position and moving in both directions were presented 10 times (3 sequence lengths × 8 starting positions × 2 directions × 10 repetitions = 480 trials).
- Sequences with 4, 5, 6, 7, or 8 consecutive presentations starting at each position and moving in both directions were presented twice (5 sequence lengths × 8 starting positions × 2 directions × 2 repetitions = 160 trials).
- Sequences with 16, 20, 24, 28, 32, 36, 40, or 44 consecutive presentations starting at each position and moving in both directions were presented once (8 sequence lengths × 8 starting positions × 2 directions = 128 trials).
- Sequences with 16, 20, 24, 28, 32, 36, 40, or 44 consecutive presentations starting at each position and moving in both directions, followed by a reversal and continuation in the opposite direction for 8-16 (randomly determined) additional presentations, were presented once (8 sequence lengths × 8 starting positions × 2 directions = 128 trials).
Because 32 target trials were appended to the trial list, each block comprised 928 trials (16 sets of 58 trials). Each set was initiated with a button press. Each participant completed two blocks per session, with each block lasting ∼30 min. In total, each participant completed 5568 trials.
Under Protocol 2, all types of trials were combined in a single block, randomly interleaved as follows:
- Sequences with 4, 5, 6, 7, or 8 consecutive presentations starting at each position and moving in both directions were presented 8 times (5 sequence lengths × 8 starting positions × 2 directions × 8 repetitions = 640 trials).
- Sequences with 9, 10, 11, 12, 13, 14, 15, or 16 consecutive presentations starting at each position and moving in both directions were presented 4 times (8 sequence lengths × 8 starting positions × 2 directions × 4 repetitions = 512 trials).
- Sequences with 9, 10, 11, 12, 13, 14, 15, or 16 consecutive presentations starting at each position and moving in both directions, followed by a reversal and continuation in the opposite direction for 1-8 (randomly determined) additional presentations, were presented 4 times (8 sequence lengths × 8 starting positions × 2 directions × 4 repetitions = 512 trials).
In each block, a target was randomly presented in 50 trials, and these trials were discarded. Each block was split into 13 sets, and each set was initiated with a button press. Participants completed one block per session, with each block taking ∼90 min. In total, each participant completed 3328 trials.
EEG acquisition and preprocessing
The 64-channel EEG data, and data from six EOG and two mastoid channels, were acquired using a BioSemi ActiveTwo EEG system sampling at 2048 Hz. EEG data were rereferenced offline to the average of the two mastoid electrodes and resampled to 512 Hz. Eleven participants each had one bad channel during one of the sessions; these channels were spherically interpolated using EEGlab (Delorme and Makeig, 2004).
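For illustration, these preprocessing steps can be expressed with standard EEGlab calls. The following is a minimal sketch, not the authors' code: the file name, the mastoid channel indices (65, 66), and the bad-channel index (12) are hypothetical placeholders, and importing BioSemi files assumes the BIOSIG plugin is installed.

```matlab
% Minimal preprocessing sketch using EEGlab (Delorme and Makeig, 2004);
% assumes EEGlab is on the MATLAB path. File name, mastoid channel indices
% (65, 66), and the bad-channel index (12) are hypothetical placeholders.
EEG = pop_biosig('session01.bdf');          % import BioSemi data (BIOSIG plugin)
EEG = pop_reref(EEG, [65 66]);              % rereference to the mastoid average
EEG = pop_resample(EEG, 512);               % downsample from 2048 to 512 Hz
EEG = eeg_interp(EEG, 12, 'spherical');     % spherically interpolate a bad channel
```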
All data were epoched relative to stimulus onset. For the decoding analysis, we make a distinction between training and test epochs. Training epochs (−150 to 150 ms) were used to train temporally specific linear discriminant analysis classifiers. Under both protocols, the training epochs were time-locked to the first presentation in a sequence. The initial stimulus was random and had no history, meaning its position could not be anticipated. The training data were initially epoched from 800 ms before stimulus onset to 800 ms after, and baseline-corrected to the mean of the 200 ms period before stimulus onset. Reduced epochs (−150 to 150 ms) were then extracted and concatenated before time-frequency decomposition.
Test epochs were extracted relative to the onset of 4 events of interest (Start, Middle, Stop, and Reversal). Initial epochs were again taken from −800 to 800 ms and were baseline-corrected to the mean of the 800 ms period before stimulus onset. This baseline period was chosen to be consistent across all epochs and to contain a full cycle of motion in the majority of epochs, minimizing the introduction of stimulus-specific differences. Reduced epochs (−400 to 800 ms) were then extracted and concatenated before time-frequency decomposition. Training and testing epochs in which the amplitude at any of the 8 occipital electrodes exceeded 100 μV were rejected. Across all observers, 11.70% (SD = 6.98%) of epochs were removed in this way.
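As a concrete sketch, the threshold rejection described above reduces to a few lines; the array layout, variable names, and occipital channel indices below are assumptions for illustration, not the authors' code.

```matlab
% Reject epochs in which any of the 8 occipital channels exceeds +-100 uV.
% 'epochs' (channels x samples x trials) and 'occIdx' are assumed names.
epochs = 20 * randn(64, 615, 500);                   % placeholder epoched data (uV)
occIdx = 57:64;                                      % hypothetical occipital channel rows
keep   = squeeze(all(all(abs(epochs(occIdx, :, :)) <= 100, 1), 2));
epochs = epochs(:, :, keep);                         % retain only clean epochs
rejectedPct = 100 * (1 - mean(keep));                % percentage rejected, as reported
```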
Time-frequency decomposition and power-based decoding analysis
To focus on EEG activity recorded over the visual cortex, our analyses were restricted to the 8 occipital electrodes (PO7, PO3, O1, POz, Oz, O2, PO4, PO8). To construct the training set, we extracted epochs between −150 and 150 ms relative to the onset of the first stimulus in the apparent motion sequences. To construct the testing set, we extracted epochs between −400 and 800 ms relative to the 4 events of interest (Start, Middle, Stop, and Reversal).
We decoded from time point-specific normalized power estimates, to avoid potential issues with baselining (Hajonides et al., 2021). To extract these power estimates, time-frequency decomposition was performed using custom MATLAB code. The EEG time series was convolved with a set of complex Morlet wavelets, defined as Gaussian-windowed complex sine waves: e^{i2πft} e^{−t²/(2σ²)}, where t is time, f is frequency (which increased from 2 to 40 Hz in 20 linearly spaced steps, although for consistency, the third extracted frequency was set to 6.67 Hz to align with the slowest stimulus presentation rate in Experiment 2), and σ = n/(2πf) defines the width of each frequency band, with n increasing logarithmically from 3 to 10. From the resulting analytic signal z, we obtained power estimates p(t) = |z(t)|².
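To make the decomposition concrete, the following is a minimal sketch of the wavelet convolution under the stated parameters. Variable names, the wavelet support, and the normalization are illustrative choices, not the authors' implementation.

```matlab
% Morlet-wavelet power extraction, following the parameters in the text.
srate = 512;                                   % sampling rate (Hz)
eeg   = randn(1, 5 * srate);                   % placeholder single-channel time series
freqs = linspace(2, 40, 20);                   % 20 linearly spaced frequencies
freqs(3) = 6.67;                               % align with slowest update rate (Exp. 2)
nCyc  = logspace(log10(3), log10(10), 20);     % n: logarithmically spaced, 3 to 10

tWav  = -1:1/srate:1;                          % wavelet support (s); assumed length
power = zeros(numel(freqs), numel(eeg));
for fi = 1:numel(freqs)
    f     = freqs(fi);
    sigma = nCyc(fi) / (2 * pi * f);           % Gaussian width of the wavelet
    wav   = exp(1i * 2 * pi * f * tWav) .* exp(-tWav.^2 / (2 * sigma^2));
    wav   = wav / sum(abs(wav));               % unit-energy normalization (one choice)
    z     = conv(eeg, wav, 'same');            % complex analytic signal
    power(fi, :) = abs(z).^2;                  % p(t) = |z(t)|^2
end
```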
To investigate the spectral locus of stimulus-position information, we trained linear discriminant analysis classifiers at each training time point (between 50 and 150 ms) to predict the position of the initial stimulus from frequency-specific normalized power estimates. Across testing time points (−400 to 800 ms), we then took individual trials and computed the posterior probabilities associated with the stimulus being in each of the 8 possible positions. Averaging across testing trials, this yielded 8 values indicating the probability that the stimulus was in a given position, for a given training representation. To temporally generalize this measure, we averaged the probabilities from each of the temporally specific classifiers (50-150 ms). This yielded a time-generalized measure of the relative probability that a stimulus was in each of the possible locations, at a given testing time point (i.e., a probabilistic map). Temporal generalization was necessary to allow for the fact that the timing of sensory processing likely changes when stimuli are predictable (Blom et al., 2020). Finally, we reordered the resulting probability values to center the location of the presented stimulus at t = 0, flipping one motion direction condition to align the probability estimates. Averaging across stimulus positions and motion directions, this yielded frequency-specific maps of the stimulus position over time.
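The following sketch illustrates the core of this pipeline on synthetic data, using MATLAB's fitcdiscr for the LDA step. Array shapes, variable names, and the omission of the motion-direction flip are simplifying assumptions.

```matlab
% Probabilistic position mapping from power estimates (illustrative sketch).
nTrain = 400; nTest = 200; nPos = 8; nChan = 8;             % 8 occipital channels
trainT = -0.150:1/512:0.150; testT = -0.400:1/512:0.800;    % epoch time axes (s)
trainPow = randn(nTrain, nChan, numel(trainT));             % placeholder power
testPow  = randn(nTest,  nChan, numel(testT));
trainLab = randi(nPos, nTrain, 1);                          % initial stimulus position
testLab  = randi(nPos, nTest, 1);                           % position at t = 0

kTrain = find(trainT >= 0.050 & trainT <= 0.150);           % training window
post = zeros(nTest, nPos, numel(testT));
for k = kTrain
    mdl = fitcdiscr(trainPow(:, :, k), trainLab);           % time-point-specific LDA
    for tt = 1:numel(testT)
        [~, p] = predict(mdl, testPow(:, :, tt));           % posteriors over 8 positions
        post(:, :, tt) = post(:, :, tt) + p / numel(kTrain); % time-generalized average
    end
end

% Re-center so the presented position occupies the same row on every trial
% (the direction flip applied before averaging is omitted here).
centered = zeros(size(post));
for tr = 1:nTest
    centered(tr, :, :) = circshift(post(tr, :, :), 1 - testLab(tr), 2);
end
probMap = squeeze(mean(centered, 1));                       % positions x test time
```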
To examine the time courses of position information encoding, and to test for evidence of temporal prediction, we extracted the position evidence time course for the location one step ahead of the position the stimulus occupied at t = 0 (i.e., one position forwards along its original trajectory of motion). This allowed us to see whether future (expected) position representations were activated when the stimulus unexpectedly reversed direction or disappeared (i.e., in the absence of direct visual input). To assess the frequency specificity of stimulus-position information encoding, we convolved a cosine function with each frequency-specific spatial tuning function (i.e., probabilistic maps constructed from the power of individual frequencies). Averaging across time (50-150 ms), this yielded a single estimate of the strength of the stimulus-position information encoded at each frequency.
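One common implementation of this readout (after Hajonides et al., 2021) reduces to weighting the centered tuning curve by a cosine and summing; below is a sketch under assumed array shapes and names.

```matlab
% Cosine readout of decoding strength per frequency (illustrative sketch).
% 'tuning' (freqs x positions x time) is assumed to be centered on the
% true position, as in the maps sketched above.
nFreqs = 20; nPos = 8; nT = 615;
tuning = rand(nFreqs, nPos, nT);               % placeholder frequency-specific maps
theta  = 2 * pi * (0:nPos-1) / nPos;           % angular offset of each position
w      = cos(theta);                           % peaks at the centered (true) position
tWin   = 231:282;                              % samples spanning ~50-150 ms (assumed)
info   = zeros(nFreqs, 1);
for f = 1:nFreqs
    tc      = mean(tuning(f, :, tWin), 3);     % time-averaged tuning curve
    info(f) = w * tc(:);                       % single information estimate
end
```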
Statistical analysis
We adopted a nonparametric approach to analyzing the position evidence time courses (see Figs. 3B, 4B). Specifically, we estimated a one-sided bias-corrected and accelerated (BCa) bootstrapped CI around the mean (10,000 bootstrap samples, α levels of 0.05 and 0.01). Time points where this interval exceeded 0 were taken as being significantly different from chance.
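A sketch of this test using MATLAB's bootci (Statistics and Machine Learning Toolbox) follows; the data layout is an assumption, and the one-sided interval at α = 0.05 is obtained as the lower bound of a two-sided 90% BCa interval.

```matlab
% One-sided BCa bootstrap test of position evidence against chance.
% 'evidence' (participants x time points) is an assumed layout, expressed
% as evidence minus chance so that 0 is the null value.
evidence = 0.01 * randn(32, 615);              % placeholder data
nBoot = 10000;
sig05 = false(1, size(evidence, 2));
for t = 1:size(evidence, 2)
    ci = bootci(nBoot, {@mean, evidence(:, t)}, ...
                'Alpha', 0.10, 'Type', 'bca'); % two-sided 90% BCa interval
    sig05(t) = ci(1) > 0;                      % lower bound > 0: significant
end
```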
Experiment 2
To control for the underlying rhythmicity of the stimulus, we ran a second experiment in which the update rate of the stimulus was varied across three frequencies: 100 ms (10 Hz), 125 ms (8 Hz), and 150 ms (6.67 Hz). Unless otherwise stated, the procedure used in Experiment 2 was identical to that of Experiment 1.
Participants
Thirty-four observers (17 female, mean age 25 years) participated in Experiment 2. All observers had normal or corrected-to-normal vision. The experimental protocol was approved by the human research ethics committee of the University of Melbourne (Ethics ID 1954628), Australia. All observers gave written informed consent before participating and were reimbursed AUD15 per hour.
Procedure
Participants completed three separate blocks of apparent motion sequences in a random order. The ISI varied between blocks (Block Type 1: 33.33 ms ISI, 100 ms update rate; Block Type 2: 58.33 ms ISI, 125 ms update rate; Block Type 3: 83.33 ms ISI, 150 ms update rate), while the stimulus presentation time (66.66 ms) was held constant. Each block consisted of sequences of 4-12 consecutive presentations starting at each position and moving in both directions, presented 6 times (9 sequence lengths × 8 starting positions × 2 directions × 6 repetitions = 864 trials).
EEG acquisition and preprocessing
All acquisition and screening procedures were identical to Experiment 1. Thirteen participants had one bad channel during one of the sessions; these channels were spherically interpolated using EEGlab (Delorme and Makeig, 2004). Across all participants, 14% (SD = 9.13%) of epochs were removed for exceeding the 100 μV limit. The time-frequency decomposition, decoding, and statistical analysis methods were identical to those used in Experiment 1.
Data and code availability
Code and data for recreating all analyses will be made available on the Open Science Framework at the time of publication: https://osf.io/x8n9p/.
Results
Figure 2B shows probabilistic stimulus-position maps derived from the power of occipital oscillations in the α/low-β range (10-16 Hz), split by motion direction and event of interest (Start, Middle, Stop, Reversal, Fig. 2A).
Figure 3A shows the same data, collapsed across motion direction. The probabilistic maps presented in Figures 2 and 3 reveal that occipital α/low-β oscillations contain position information, with the high-probability region (shown in red) consistently tracking the location of the stimulus, even when the stimulus unexpectedly reverses direction. This demonstrates that it is possible to reconstruct the stimulus' trajectory from the power of these oscillations alone.
To examine whether position information was encoded in a temporally predictive fashion, we examined the evidence time course for the position one step ahead of the position the stimulus occupied at t = 0 (i.e., one position forwards along its original trajectory of motion, Fig. 3B). In line with our previous work using more conventional classification analysis applied to raw EEG amplitudes (Blom et al., 2020), these time courses reveal that, when the stimulus stopped or reversed, there was anticipatory activation of the next expected position representation at the time of expected presentation.
To assess the frequency specificity of stimulus-position information encoding, we convolved a cosine function with each frequency-specific spatial tuning function (i.e., probabilistic maps constructed from the power of individual frequencies) between 50 and 150 ms (Hajonides et al., 2021). Averaging across time, this yielded a single estimate of the strength of the stimulus-position information encoded at each frequency. Consistent with the theoretical work of Alamia and VanRullen (2019), this analysis revealed that stimulus-position information was strongly encoded in the α range (Fig. 3C). Interestingly, peak encoding occurred at roughly the border of the canonical α and β ranges (12 Hz), with clear information encoding extending into the low-β range (∼12-20 Hz). This potentially suggests that the relevant time delay for visual processing is slightly shorter than Alamia and VanRullen (2019) originally assumed (see Discussion). We note that this entire pattern of results also holds after first subtracting the condition-specific ERPs from the data, suggesting that non-phase-locked power effects, and not simply VEP amplitude differences, drive the decoding (results not shown, see online data).
One interpretation of the results from Experiment 1 is that occipital α/low-β oscillations are a spectral signature of ongoing recursive signaling between hierarchically organized regions of the visual system, with their power carrying (spatial) information about the underlying stimulus being processed. However, because the apparent motion stimulus in this experiment was updated every 100 ms (i.e., at 10 Hz), it is possible that stimulus-related information was entrained in the α range by the underlying rhythmicity of stimulus-evoked activity.
To examine this possibility, we ran a second experiment in which we varied the stimulus update rate across three frequencies: 10 Hz (100 ms), 8 Hz (125 ms), and 6.67 Hz (150 ms). A new group of participants (N = 34) viewed an otherwise identical apparent motion stimulus moving along a circular path, while EEG was recorded. Figure 4A shows stimulus-position maps derived from the power of occipital α/low-β oscillations (10-16 Hz) in Experiment 2. Different rows illustrate different stimulus update rates (10, 8, or 6.67 Hz), for the Start (left column), Middle (middle column), and End of the sequence (right column). There were no motion reversals in this experiment.
The results of Experiment 2 replicated those of Experiment 1. The location of the stimulus could again be tracked from the power of occipital α/low-β oscillations (10-16 Hz), across variations in stimulus update rate. Similarly, we again saw an increase in probability for the next expected position when the stimulus disappeared (Fig. 4B).
Figure 4C shows that stimulus-position information was again strongly encoded in the α/low-β range, across variations in stimulus update rate. Even after extended exposure to driven input at lower frequencies (i.e., in the Middle and Stop epochs), there is minimal effect on information encoding within the α/β range. At lower frequencies, however, there is slightly more (albeit inconsistent) variability across update rates: the 6.67 and 8 Hz conditions show qualitatively stronger information encoding, although this does not scale perfectly with the stimulation frequency (see Discussion).
Finally, to examine the time course and spectral locus of predictive position information in greater detail, we collapsed the data from all Stop epochs across both experiments. Focusing on the +1 ahead position (i.e., Fig. 3A, boxed region), we calculated the median probability time course across participants for all stop conditions (one condition in Experiment 1 and three in Experiment 2, one per update rate). Averaging across these yields a frequency by time image of probability values (Fig. 5A). Examining this image, we can see that, up until the presentation of the last stimulus (at 0 ms), there is a clear suppression of positional probability for the +1 ahead position (shown in black), occurring in the α/low-β range. This is unsurprising, given that the stimulus is presented in other locations during this period. Around the expected time of stimulus presentation, however, we see evidence of a switch to above-chance decoding (shown in white), although no stimulus is actually presented. Figure 5B shows the time course of this effect within the α/low-β range (10-16 Hz). Importantly, Figure 5C shows that this predictive effect is specific to the α/low-β frequency range.
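For clarity, the pooling used here amounts to a median across participants within each stop condition, averaged across conditions into a frequency-by-time image; a sketch under an assumed data layout (the array name, shape, and chance-subtraction convention are illustrative assumptions):

```matlab
% Pooled Stop-epoch analysis (illustrative sketch).
% 'prob': participants x freqs x time x stop condition (1 from Exp. 1,
% 3 from Exp. 2), holding the +1-ahead probability minus chance (1/8).
prob  = 0.01 * randn(66, 20, 615, 4);            % placeholder data
med   = squeeze(median(prob, 1));                % median across participants
fxt   = mean(med, 3);                            % average across the 4 conditions
tAxis = linspace(-400, 800, 615);                % ms, relative to final stimulus
fAxis = linspace(2, 40, 20);                     % Hz (3rd frequency adjusted in practice)
imagesc(tAxis, fAxis, fxt); axis xy; colorbar    % frequency-by-time probability image
```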
Discussion
Across two experiments, we investigated whether stimulus-related (spatial) information is encoded in the power (squared amplitude) of neural oscillations over the occipital cortex. We also examined whether information is encoded in a temporally predictive fashion, as is required for predictive coding networks to effectively minimize prediction error under neural delays (Hogendoorn and Burkitt, 2019).
In Experiment 1, we demonstrated that the location of a moving stimulus could be decoded from the power of occipital oscillations in the α/low-β frequency range (with peak encoding at ∼12 Hz). Strikingly, we found it was possible to track the position of the moving stimulus and reconstruct its trajectory from the power of these rhythms alone. We also observed anticipatory activation of the expected but unstimulated stimulus position following the end of a motion sequence. This demonstrates that the previously reported preactivation revealed by analysis of raw EEG amplitudes (Blom et al., 2020) is likely encoded in α/low-β band activity. In Experiment 2, by varying the update rate of the stimulus, we demonstrated that the encoding of information in this frequency range is not driven by visual entrainment.
This study contributes to an emerging line of research examining potential links between the in silico oscillatory dynamics of hierarchical predictive coding networks and rhythmic activity patterns in human EEG recordings (Alamia and VanRullen, 2019; Alamia et al., 2020). To our knowledge, this study is the first to demonstrate that the power of occipital oscillations in the α/low-β range carries predictive stimulus-related information. This finding is broadly consistent with the theoretical predictions of Alamia and VanRullen (2019). Interestingly, we found that peak information occurred at the border of the canonical α and low-β frequency ranges (12 Hz in Experiment 1). This potentially suggests that the relevant inter-regional delay for visual processing may be shorter than originally assumed (i.e., <12 ms), leading to a higher-frequency macroscopic spectral signature (Alamia and VanRullen, 2019). From the results of Experiment 2, there was some evidence that, qualitatively speaking, the strength of information encoding at lower frequencies depended on the stimulus update rate. One interpretation of this overall pattern of results is that there is both a stimulus-independent oscillatory signature in the α/low-β range, which emerges because of the inherently rhythmic dynamics of hierarchical predictive coding under signaling delays (Alamia and VanRullen, 2019; Hogendoorn and Burkitt, 2019), and an additional stimulus-dependent component, which emerges when the stimulation frequency deviates sufficiently from this range. Ultimately, however, further studies in which the stimulus update rate is varied across a wider range are needed to fully address this possibility.
As a tentative account of the observed effects, we suggest that stimulus onset may have generated waves of activity originating from retinotopically specific locations in visual cortex. Indeed, previous theoretical and empirical work has established the existence of such waves, and has shown that they manifest as oscillations in the α frequency range (Alamia and VanRullen, 2019; Lozano-Soldevilla and VanRullen, 2019; Alamia et al., 2020). Crucially, given their spatially specific nature, these waves of activity will have been registered at the electrode level as relative power differences. By examining normalized oscillatory power across electrodes, it was therefore possible to determine the onscreen position of the stimulus.
While stimulus-position information was encoded most strongly in the α/low-β range, it should not be concluded that spatial information is exclusively encoded at this (relatively slow) temporal scale. Indeed, it is likely that spatial information is also processed on a more fine-grained timescale. In the present study, we may not have been able to observe this because of the relatively poor resolution of scalp-based EEG recordings for high-frequency oscillations. Overall, the fact that stimulus-position information was predominantly encoded in the α/low-β range aligns with the view that oscillations in this general frequency range may be a macroscopic signature of predictive message passing between hierarchically organized regions of the visual system (Alamia and VanRullen, 2019). While such signaling almost certainly operates over many temporal and spatial scales, neural delays potentially cause these oscillations to be the most prominent macroscopic rhythmic “fingerprint” of network activity.
In further support of this view, the probabilistic spatial maps we constructed displayed temporally predictive activation patterns. Specifically, when the stimulus disappeared or unexpectedly reversed its direction, we observed an increase in the probability the stimulus was occupying the next position along its original trajectory of motion, at the expected time of presentation. One could argue that these anticipatory activation patterns may be because of spatial smearing or variability in decoding; however, we have observed similar dynamics in our previous work using more conventional classification analysis applied to raw EEG amplitudes (Blom et al., 2020). Moreover, similar anticipatory patterns have also been directly observed in numerous animal neurophysiology studies (Berry et al., 1999; Jancke et al., 2004; Trenholm et al., 2013; Chemla et al., 2019; Benvenuti et al., 2020; Liu et al., 2021) as well as in more recent human fMRI experiments (Ekman et al., 2017, 2023). The novelty of our current work therefore lies in the demonstration that these anticipatory spatial representations manifest in α/low-β oscillations, consistent with recent computational predictions (Alamia and VanRullen, 2019).
Considering potential neural mechanisms underlying these dynamics, there are two main possibilities. First, it is possible that anticipatory activation is facilitated by an omni-directional spreading of activity between retinotopically organized neural populations. This spreading could be mediated by within-region lateral connectivity (Benvenuti et al., 2020; Liu et al., 2021) or between-region divergent connectivity (Baldo and Caticha, 2005). A second possibility is that more complex sequence learning mechanisms are involved. For example, it has recently been shown that after repeated exposure to visual sequences, activity in the visual cortex associated with these sequences can be predictively activated (“pre-played”) by the presentation of just a single stimulus (Ekman et al., 2017, 2023). It is possible that, in our experiments, similar predictive associations between neighboring stimulus positions were generated, leading to anticipatory activation (although why pre-play of an ongoing sequence did not occur would need to be accounted for). To arbitrate between these possibilities, future studies could examine the dynamics that arise when participants are exposed to arbitrary, noncontiguous sequences of flashes (as in Ekman et al., 2023), using the decoding approach developed in the current study.
One further point regarding temporal prediction should be made: despite showing temporally predictive qualities (activation of likely future positions in the absence of direct input), the bulk of activity in the spatial probability maps still lagged behind the stimulus (although activity onset did align with stimulus onset). This raises the question of whether there was sufficient temporal prediction to fully compensate for neural signaling delays (Hogendoorn and Burkitt, 2019). Ultimately, given the temporal smearing inherent to time-frequency-based analyses, fully addressing this question is beyond the scope of this paper. It may be better tackled using alternative methods with more fine-grained temporal resolution (see Blom et al., 2020; Johnson et al., 2023).
In appraising the current findings, it is important to consider two potential non–stimulus-driven sources of information that may have influenced our decoding analyses: (1) eye movements and (2) spatial attention differences. Eye movements are an insidious potential artifact in neuroimaging experiments that must be considered when using classification analyses (Quax et al., 2019). However, three factors greatly limit the possibility that eye movements confounded the current analyses. First, in earlier analyses of the data from Experiment 1 (Blom et al., 2020), we demonstrated in a control sample of participants that the position of the stimulus could not be decoded from eye-movement traces. Second, the current analyses were restricted to only occipital electrodes. Since eye-movement-related muscle activity manifests predominantly at frontal electrodes, the likelihood that we are picking up on eye movements is further reduced. Finally, training epochs were limited to the first 150 ms after the (unpredictable) onset of a motion sequence. Since saccade onsets and corresponding eye-movement-related biases in decoding performance typically occur >200 ms after stimulus onset (Quax et al., 2019), this further reduces the likelihood of eye-movement confounds. By restricting our analyses to an early time window, and only analyzing activity recorded directly over the visual cortex, it is more likely that the current analyses are tapping into the initial feedforward sweep of visual information processing, rather than eye-movement-related information.
It is also important to consider whether our decoding analyses were confounded by position-related differences in spatial attention. This is because it has been demonstrated that position-specific differences in covert spatial attention can be decoded from the power of α oscillations (Foster et al., 2017). However, the theoretical work of Alamia and VanRullen (2019) potentially prompts a subtle but significant reinterpretation of this earlier finding. While Foster et al. (2017) clearly showed that shifts in attention co-occur with changes in α power, this does not mean that α oscillations necessarily directly reflect the deployment of spatial attention. Rather, top-down shifts in attention (occurring at ∼300 ms in Foster et al., 2017) likely alter neural activity patterns in the visual system. Under the account of Alamia and VanRullen (2019), this would, in turn, change the pattern/amplitude of occipital α oscillations. In that sense, α oscillations would reflect the knock-on effect that spatial attention differences have on macroscopic network dynamics, rather than the deployment of spatial attention directly. Considering the present results, the fact that our training epochs were restricted to 50-150 ms after initial stimulus onset again makes it more likely that we are tapping into the first sweep of visual information processing rather than spatial-attention differences, which one might expect to manifest over a slower timescale.
In conclusion, consistent with recent in silico simulations (Alamia and VanRullen, 2019), we have shown that occipital α/low-β oscillations carry predictive stimulus-related information. By examining the power of these rhythms, we could reconstruct the trajectory of a moving stimulus, tracking its position even across unexpected motion reversals. Moreover, we found that future position representations were anticipatorily activated in the absence of direct visual input, indicative of temporally predictive processing. Collectively, these results support the view of α/low-β oscillations as a potential spectral “fingerprint” of hierarchical predictive processing in the human visual system.
Footnotes
This work was supported by Australian Research Council Grants FT200100246 and DP180102268 to H.H. We thank Jane Yook, Milan Andrejević, and Vinay Mepani for help with data collection; and Philippa Johnson for helpful discussions.
The authors declare no competing financial interests.
- Correspondence should be addressed to William Turner at w6.turner@qut.edu.au