Playing an unfamiliar sport initially feels chaotic and reactive. But as it becomes familiar, players learn to anticipate each other's movements and the game falls into neat order. The ability to predict upcoming visual information based on past experience underlies many of the complex behaviors humans can perform: predictive processing biases our perception of ambiguous stimuli toward probable interpretations (Hansen et al., 2006) and quickens stimulus detection (Stokes et al., 2012; Cravo et al., 2017). One central question in neuroscience is how the brain instantiates memory-based predictions when faced with complex visual input.
One framework for considering the neural basis for memory-based predictions is predictive coding, which proposes that memory-related brain areas exert top-down influence on lower-level sensory areas to bias sensory processing toward probable stimuli (Lee and Mumford, 2003; Albright, 2012; Bastos et al., 2012). Recent work using fMRI and electrophysiology supports this view. Early visual cortex represents the typical color of an object (i.e., banana as yellow) even when it is presented in grayscale (Bannert and Bartels, 2013), and neurons in macaque MT represent learned motion associated with static shapes (Schlack and Albright, 2007). Meanwhile, predictions made from learned co-occurrence of stimuli recruit the hippocampus. For instance, one study showed that after participants learned to associate a visual stimulus with a specific sound, the hippocampus represented the predicted visual stimulus in response to its associated auditory cue (Kok and Turk-Browne, 2018). In related work, after participants learned to associate pairs of abstract fractals, the hippocampus represented the action–outcome association (i.e., stimulus A + action X yields stimulus B), while visual cortex represented the identity of the upcoming stimulus predicted from the learned association (Hindy et al., 2016).
Two important questions emerge from this work. First, although the literature above and other work (Kok et al., 2020; Aitken and Kok, 2022) show that the brain uses learned associations to predict upcoming stimuli (i.e., first-order associations), many phenomena unfold over a sequence of events. Do predictive processes in the visual system operate over longer sequences of stimuli (i.e., second- and higher-order associations)? Second, past work suggests that hippocampus and visual cortex contain representations of upcoming visual stimuli, but the relationship between these representations is not clear. Do these predictions come about concurrently, or is the prediction in visual cortex dependent on the hippocampal representation?
In a recent article in The Journal of Neuroscience, Clarke et al. (2022) address how predictions occur in the brain during perception of complex stimulus sequences. Participants learned the layouts of two “zoos” that contained the same set of animal images. Animals appeared one at a time in a sequence, and participants advanced through the sequence by indicating a direction (up/down/left/right) based on the learned layout of the current zoo. The zoos had identical sequence relationships between neighboring animals but moving through the same sequence required different actions. For example, both zoos contained the sequence: “horse-rabbit-tiger,” but moving from a horse to a rabbit required an “up” response in one zoo and “left” in the other. In the fMRI scanner, participants saw a cue depicting a starting animal and a goal animal, and indicated the action sequence required to move between them. Critically, one-quarter of the trials were “catch” trials, which ended before participants reached the goal animal. This enabled the authors to isolate neural activity produced during stimulus prediction from neural activity produced during stimulus perception, because catch trials displayed no visual content after ending early.
Clarke et al. (2022) focused their analyses on regions likely to represent the current stimulus [including primary visual cortex (V1)/secondary visual cortex (V2)], as well as regions that would likely harbor predictive representations of upcoming stimuli (i.e., hippocampus and posterior medial cortical network). Multivoxel pattern similarity analysis (i.e., the correlation in the pattern of activation across voxels within a region of interest; Haxby et al., 2001) was used to identify representations of stimuli at the item (single animal), sequence (ordered set of animals), and prediction (upcoming single animal) level.
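The logic of pattern similarity analysis can be illustrated with a minimal sketch on synthetic data. This is not the pipeline used by Clarke et al. (2022); the function names (`pattern_similarity`, `item_representation_index`), the voxel count, and the noise level are all assumptions chosen for illustration. The sketch correlates voxel patterns across trials and asks whether patterns evoked by the same item are more similar than patterns evoked by different items.

```python
import numpy as np


def pattern_similarity(pattern_a, pattern_b):
    """Pearson correlation between two voxel activation patterns."""
    a = np.asarray(pattern_a, dtype=float)
    b = np.asarray(pattern_b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


def item_representation_index(patterns, labels):
    """Mean same-item similarity minus mean different-item similarity.

    patterns: array of shape (n_trials, n_voxels)
    labels:   sequence of n_trials item labels
    """
    patterns = np.asarray(patterns, dtype=float)
    labels = np.asarray(labels)
    sim = np.corrcoef(patterns)  # trials x trials correlation matrix
    # Use each trial pair once and exclude the diagonal (self-similarity).
    upper = np.triu(np.ones(sim.shape, dtype=bool), k=1)
    same = labels[:, None] == labels[None, :]
    return float(sim[upper & same].mean() - sim[upper & ~same].mean())


# Synthetic region of interest: each item has a fixed voxel template,
# and every trial adds independent noise (all values are hypothetical).
rng = np.random.default_rng(0)
n_voxels = 50
templates = {
    "horse": rng.normal(size=n_voxels),
    "rabbit": rng.normal(size=n_voxels),
}
labels = ["horse", "rabbit"] * 4
patterns = np.array(
    [templates[name] + 0.5 * rng.normal(size=n_voxels) for name in labels]
)

index = item_representation_index(patterns, labels)
print(f"item representation index: {index:.3f}")
```

A positive index indicates that the region carries item-specific information. In principle, the same comparison can be applied at the sequence or prediction level by changing what the labels denote (e.g., the omitted animal on a catch trial).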
What brain regions contained predictive representations? Clarke et al. (2022) reasoned that predictions would occur in the same regions that represent the current stimulus (i.e., item-level representation). Using pattern similarity analysis, they found that V1/V2, posteromedial cortex (PMC), and posterior hippocampus represented the current stimulus. Next, using data from the catch trials, the authors asked whether the same regions represented the upcoming, predicted stimulus by looking for a representation of the stimulus that should have appeared next in the sequence (but was omitted). Interestingly, V1/V2 and PMC represented the predicted (but omitted) stimulus, but posterior hippocampus did not. The finding that V1/V2 and PMC represented the predicted stimulus aligned with the authors' expectations, since these regions also contained item-level representations. However, the absence of prediction in posterior hippocampus was somewhat surprising given that it contained item-level representations and previous reports suggested that hippocampus represented predicted visual stimuli (Kok and Turk-Browne, 2018; Kok et al., 2020).
What might explain the failure to detect a representation of the predicted stimulus in the posterior hippocampus? Past studies investigating prediction have used first-order associations between pairs of stimuli, and stimuli only occurred within their specific pair. In contrast, Clarke et al. (2022) used two contexts (i.e., zoos), which encouraged an abstract representation of the sequence regardless of context. Indeed, Clarke et al. (2022) found that rather than representing the predicted stimulus, the posterior hippocampus represented the current sequence, regardless of zoo. This suggests that posterior hippocampus represents high-level information about the overarching event structure, rather than predictions, per se.
Discovering a distinction between the representation of the sequence and that of the predicted stimulus led to a second question: do stimulus predictions in cortical regions arise from hippocampal sequence representations? Clarke et al. (2022) reasoned that the sequence representation in posterior hippocampus could provide the context needed for lower-level regions to represent individual predicted stimuli. To test this hypothesis, the authors used a representational connectivity analysis to investigate the temporal relationship between the sequence and predictive representations. Specifically, they compared the strength of the sequence representation in posterior hippocampus at a key decision point before a stimulus was omitted with the strength of the representation of the omitted (predicted) stimulus in V1/V2 and PMC. Consistent with their hypothesis, the strength of the sequence representation in posterior hippocampus was correlated with the strength of subsequent prediction representations in V1/V2 and PMC. This result suggests that cortical predictions may depend on contextual representation in hippocampus.
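The core idea of a representational connectivity analysis can be sketched as a trial-wise correlation between representation strengths in two regions. The following is a simplified illustration on simulated values, not the authors' implementation; the function name `representational_connectivity`, the trial count, and the coupling strength are assumptions. It assumes that a per-trial strength measure (e.g., a pattern similarity score) has already been computed for each region.

```python
import numpy as np


def representational_connectivity(seed_strength, target_strength):
    """Correlate per-trial representation strength between two regions.

    seed_strength:   one value per trial (e.g., hippocampal sequence strength)
    target_strength: one value per trial (e.g., cortical prediction strength)
    """
    seed = np.asarray(seed_strength, dtype=float)
    target = np.asarray(target_strength, dtype=float)
    if seed.ndim != 1 or seed.shape != target.shape:
        raise ValueError("inputs must be 1-D arrays with one value per trial")
    return float(np.corrcoef(seed, target)[0, 1])


# Simulated per-trial strengths (all values are hypothetical).
rng = np.random.default_rng(1)
n_trials = 200
hpc_sequence = rng.normal(size=n_trials)
# Cortical prediction strength that partly tracks the hippocampal measure.
v1_prediction = 0.6 * hpc_sequence + rng.normal(size=n_trials)
# A control measure with no built-in relationship to hippocampus.
unrelated = rng.normal(size=n_trials)

coupled_r = representational_connectivity(hpc_sequence, v1_prediction)
baseline_r = representational_connectivity(hpc_sequence, unrelated)
print(f"coupled r = {coupled_r:.2f}, baseline r = {baseline_r:.2f}")
```

Because the measure is correlational, a reliable positive value is consistent with hippocampal context supporting cortical prediction, but it does not by itself establish the direction of influence. In the study, it is the timing of the two measures (sequence strength at a decision point before the omission, prediction strength afterward) that motivates the directional reading.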
Together, the findings of Clarke et al. (2022) significantly extend our understanding of how predictive representations are instantiated in the brain and demonstrate a clear distinction between the contribution of hippocampus and cortex to predictive processing. Specifically, in complex environments, hippocampus represents the context for the prediction, while cortex represents the prediction, per se. Moreover, the authors' representational connectivity analysis suggests a systems-level mechanism for predictive coding in the brain, whereby the representation of context in the hippocampus drives prediction representation in cortex.
Translating the predictive mechanism proposed by Clarke et al. (2022) to real-world contexts raises several intriguing questions. Notably, we experience the visual world as a relatively continuous stream of information during wakefulness. Given this continuity, what time window does hippocampus integrate over to represent event sequences? One possibility is that a sequence representation integrates over an extended time frame by strategically sampling past information (Shankar and Howard, 2013). The hippocampus might also integrate over many timescales simultaneously. Although Clarke et al. (2022) found sequence representation only in posterior hippocampus, recent evidence suggests that hippocampus could represent information at multiple timescales, organized from short to long along the posterior-to-anterior hippocampal axis (Brunec and Momennejad, 2022). Future work could use sequences of stimuli with variable relevant time windows to explore the interaction between sequence representation and prediction.
A second open question concerns how sequence representations are transformed into prediction representations. During perception, information is transmitted through the visual hierarchy from V1 to the hippocampus via the ventral stream (Tanaka, 1996). Does generating a prediction representation from memory engage the whole visual system, or are predictions instantiated directly in V1 from hippocampus? Several recent studies suggest that intermediate regions could be involved. For example, activity propagates backward through the visual hierarchy during explicit memory recall (Favila et al., 2019; Breedlove et al., 2020; Dijkstra et al., 2020). In addition, recent evidence using immersive scenes has shown that scene-selective visual areas not only represent the current field of view, but also memory of the broader visuospatial context (Robertson et al., 2016; Berens et al., 2021). For example, the retrosplenial complex (Maguire et al., 1998; Bar and Aminoff, 2003) and occipital place area (Malach et al., 2002; Dilks et al., 2013) represent two scene views more similarly when the viewer knows their overlapping spatial context (Robertson et al., 2016). These results suggest that these areas play a role in integrating perceptual and mnemonic information, but whether these regions are involved in predictive processing is not clear. Ultimately, it will be important to consider the role of intermediate and high-level visual areas—in addition to hippocampus and early visual cortex—to understand how memory-based predictions are instantiated in the brain (Steel et al., 2021).
In conclusion, Clarke et al. (2022) significantly enhance our understanding of both the content of predictive representations during goal-oriented behavior and the temporal dynamics of those representations. Future studies should continue to investigate the neural mechanisms underpinning predictive processing and should work to elucidate what information is contained in prediction representations in cortex and how they are distinct from sequence representations in the hippocampus. Addressing these points will improve our understanding of how memory-based predictions support efficient perception as we move through complex, real-world events.
Footnotes
Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/jneurosci-journal-club.
A.S. is funded by the Neukom Center for Computational Science.
The authors declare no competing financial interests.
Correspondence should be addressed to Adam Steel at adam.steel{at}dartmouth.edu