Journal of Neuroscience
Research Articles, Behavioral/Cognitive

Conceptual Associations Generate Sensory Predictions

Chuyao Yan, Floris P. de Lange and David Richter
Journal of Neuroscience 17 May 2023, 43 (20) 3733-3742; https://doi.org/10.1523/JNEUROSCI.1874-22.2023
Chuyao Yan
1School of Psychology, Nanjing Normal University, Nanjing 210097, China
2Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, 6500 HB Nijmegen, The Netherlands
Floris P. de Lange
2Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, 6500 HB Nijmegen, The Netherlands
David Richter
2Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, 6500 HB Nijmegen, The Netherlands
3Department of Experimental and Applied Psychology, Vrije Universiteit, 1081BT Amsterdam, The Netherlands
4Institute Brain and Behavior Amsterdam, 1081BT Amsterdam, The Netherlands

Abstract

A crucial ability of the human brain is to learn and exploit probabilistic associations between stimuli to facilitate perception and behavior by predicting future events. Although studies have shown how perceptual relationships are used to predict sensory inputs, relational knowledge is often between concepts rather than percepts (e.g., we learned to associate cats with dogs, rather than specific images of cats and dogs). Here, we asked if and how sensory responses to visual input may be modulated by predictions derived from conceptual associations. To this end we exposed participants of both sexes to arbitrary word–word pairs (e.g., car–dog) repeatedly, creating an expectation of the second word, conditional on the occurrence of the first. In a subsequent session, we exposed participants to novel word–picture pairs, while measuring fMRI BOLD responses. All word–picture pairs were equally likely, but half of the pairs conformed to the previously formed conceptual (word–word) associations, whereas the other half violated this association. Results showed suppressed sensory responses throughout the ventral visual stream, including early visual cortex, to pictures that corresponded to the previously expected words compared with unexpected words. This suggests that the learned conceptual associations were used to generate sensory predictions that modulated processing of the picture stimuli. Moreover, these modulations were tuning specific, selectively suppressing neural populations tuned toward the expected input. Combined, our results suggest that recently acquired conceptual priors are generalized across domains and used by the sensory brain to generate category-specific predictions, facilitating processing of expected visual input.

SIGNIFICANCE STATEMENT Perceptual predictions play a crucial role in facilitating perception and the integration of sensory information. However, little is known about whether and how the brain uses more abstract, conceptual priors to form sensory predictions. In our preregistered study, we show that priors derived from recently acquired arbitrary conceptual associations result in category-specific predictions that modulate perceptual processing throughout the ventral visual hierarchy, including early visual cortex. These results suggest that the predictive brain uses prior knowledge across various domains to modulate perception, thereby extending our understanding of the extensive role predictions play in perception.

  • conceptual associations
  • expectation suppression
  • perception
  • predictive processing

Introduction

The brain is adept at exploiting statistical regularities in the world to improve information processing and optimize behavior (Hunt and Aslin, 2001; Turk-Browne et al., 2005; Goujon and Fagot, 2013). Specifically, the brain may generate predictions about future and current inputs based on previous experience. Indeed, it has been argued that the brain implements a form of probabilistic inference (Bar, 2007; Clark, 2013; de Lange et al., 2018), combining predictions with incoming sensory information to form the most likely interpretation of the world. Accordingly, neural activity in sensory areas, following statistical learning, appears to reflect the interplay of predictions and sensory inputs. For example, studies demonstrated attenuated neural responses to expected compared with unexpected stimuli in the ventral visual stream (Meyer and Olson, 2011; Kok et al., 2012; Richter et al., 2018; for review, see de Lange et al., 2018), often labeled "expectation suppression" and resembling signatures of prediction errors as proposed by predictive processing theories (Rao and Ballard, 1999; Friston, 2005).

However, little is known about whether and how predictions derived from conceptual (semantic) associations modulate sensory processing. In most prior studies (Meyer and Olson, 2011; Kok et al., 2012; Richter et al., 2018), associations were learned and tested for the same visual stimuli, thus allowing for specific exemplars and low-level features to be predicted. Yet, our prior knowledge about the sensory world extends beyond such simple regularities. We know, for instance, what cakes are, making them highly surprising inside shoe cabinets, without the need for exposure to a specific cake before experiencing surprise to find it in an unusual place. However, currently it is unclear whether the resulting surprise arises during perceptual processing or only at a postperceptual level. Indeed, it is possible that such surprisal arises only after sensory processing is concluded, potentially reflected by upregulations in neural responses in anterior insula or inferior frontal gyrus, shown to accompany surprising input (Richter and de Lange, 2019; Loued-Khenissi et al., 2020; Weilnhammer et al., 2021; Ferrari et al., 2022; Horing and Büchel, 2022). Alternatively, sensory processing itself may be modulated by conceptual priors, although no specific exemplars or low-level features were predicted. This latter hypothesis is suggested by previous work demonstrating that words prime corresponding object images (Puri et al., 2009; Lupyan and Ward, 2013; Stein and Peelen, 2015), thus implying that conceptual knowledge might be used to predict sensory input. However, object congruency and generalizing statistical regularities via conceptual associations are different affairs, with the latter involving further abstraction and predictions between different object concepts.

Here, we examined whether and how priors derived from statistical regularities are automatically generalized across domains, via conceptual associations, to modulate visual processing. Moreover, we asked whether such conceptual priors, without exposure to predictable visual features, may nonetheless result in tuning specific sensory predictions. To this end, we first exposed participants to pairs of sequentially presented object words (e.g., car followed by dog). Word pairs were probabilistically associated, thus resulting in trailing words being predicted (expected) by virtue of the leading words. On the next day, participants performed a categorization task on the same object pairs, but the trailing (second) object word was replaced by an image of the corresponding object (e.g., the word car followed by an image of a dog). Crucially, the leading words were not predictive of the trailing object images. Thus, any predictions of the trailing images must have arisen because of a generalization from the previously learned word–word transitions to the corresponding word–image pairs. Our results revealed a tuning specific suppression of fMRI BOLD responses throughout the ventral visual stream, including early visual cortex (EVC), to object images that correspond to the (previously) expected words. These findings suggest that priors derived from conceptual associations result in category-specific predictions that modulate perceptual processing. Thus, the predictive brain appears to use prior knowledge beyond concrete perceptual associations to generate and test sensory predictions, including priors derived from recently acquired abstract associations.

Materials and Methods

Preregistration

The current study was preregistered at the Open Science Framework before any data were analyzed (https://osf.io/fzvps). The preregistration concerns how perceptual processing is influenced by statistical regularities at a conceptual level. The experimental procedures were executed as outlined in the preregistration document, unless specified otherwise in the sections below.

Participants and data exclusion

Thirty-seven healthy participants were recruited from the Radboud University participant pool. We aimed for a sample size of 34 to obtain ≥80% power to detect a medium effect size (d = 0.5) at the standard α = 0.05. The study was approved by the local ethics committee (Committee on Research Involving Human Subjects (CMO), Arnhem-Nijmegen, Radboud University Medical Center) under the umbrella ethics approval for the Donders Center for Cognitive Neuroimaging (Imaging Human Cognition, CMO 2014/288). Written informed consent was obtained from each participant before the experiment. All subjects had normal or corrected-to-normal vision. Participants were invited for a behavioral session and an fMRI session on 2 consecutive days. Reimbursement was 8 euros/h for the behavioral session (day 1) and 10 euros/h for the MRI session (day 2). We excluded three participants based on our preregistered exclusion criteria. One participant was excluded from the fMRI analyses because of excessive head motion during scanning, defined as the percentage of framewise displacements exceeding 0.2 mm being 2 SDs above the group mean. Two additional participants were excluded from all analyses because of incomplete datasets, resulting from premature termination of the experiment because of participants not complying with task instructions. The remaining 34 participants (21 female; age 25 ± 3 years, mean ± SD; range 19–34 years) were included in all analyses.
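The preregistered head-motion exclusion criterion can be sketched as follows (a minimal illustration in Python; the function name and the simulated displacement values are ours, not from the study — in practice, framewise displacement would be derived from the MCFLIRT motion parameters):

```python
import numpy as np

def high_motion_subjects(fd_per_subject, threshold_mm=0.2, sd_cutoff=2.0):
    """Flag subjects whose fraction of high-motion volumes exceeds
    the group mean by more than sd_cutoff standard deviations.

    fd_per_subject: one 1-D array of framewise displacement (mm)
    per subject (hypothetical input structure, for illustration).
    """
    # Fraction of volumes with FD above the 0.2 mm threshold, per subject
    fractions = np.array([np.mean(fd > threshold_mm) for fd in fd_per_subject])
    cutoff = fractions.mean() + sd_cutoff * fractions.std()
    return np.where(fractions > cutoff)[0], fractions

# Simulated group: ten low-motion subjects plus one high-motion outlier
rng = np.random.default_rng(0)
fd = [rng.exponential(0.05, 500) for _ in range(10)]
fd.append(np.full(500, 0.5))
flagged, frac = high_motion_subjects(fd)
```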

Stimuli and experimental paradigm

Stimuli

We used a total of 16 stimulus categories presented as both words and images. All stimuli (both words and images) were from two superordinate categories, living and nonliving objects. Each superordinate animacy category consisted of several categories. Specifically, the living object categories were dog, elephant, mushroom, rose, man, woman, hand, foot. The nonliving object categories were airplane, car, backpack, hat, house, church, hammer, spoon. Throughout the article we consider all these categories as objects. The word stimuli were presented in English. The words subtended ∼2° to 5.8° of visual angle horizontally and ∼1° vertically. The length of the words was between three and eight letters (5.1 ± 1.9, mean ± SD). All images were selected by a manual search, except for face images, which were chosen from the Radboud Face Database (Langner et al., 2010). The images spanned ∼6° × 6° of visual angle. Of these 16 stimulus categories, four living and four nonliving object words served as leading words. Similarly, four of each animacy category were used as trailing words/images. For each trailing image 10 different exemplar images were used to avoid repetition of the same exemplar within a run. All stimuli were presented on a midgray background. Custom scripts, written in MATLAB using the Psychophysics Toolbox extension (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007), were used for stimulus presentation. During the learning session the stimuli were presented on an LCD screen (BenQ XL2420T, 1920 × 1080 pixel resolution, 60 Hz refresh rate). During MRI scanning stimuli were presented on an MRI-compatible LCD screen (Cambridge Research BOLDscreen 32, 1920 × 1080 pixel resolution, 60 Hz refresh rate), visible via an adjustable mirror mounted on the head coil.

Procedure

Subjects participated in two sessions on two consecutive days. The first day was a behavioral session (learning session) and the second day an MRI session (generalization session), including an object categorization and a functional localizer task.

Learning session

Unbeknownst to the participants, the word–word session served as a learning session of the statistical regularities. During each trial, two words, referring to two different objects, were presented sequentially, each for 0.5 s, with an intertrial interval (ITI) of 1–3 s randomly sampled from a uniform distribution (Fig. 1A). The first word probabilistically predicted the identity of the second word. A fixation bull's-eye (0.3° visual angle in size) was presented throughout the experiment. The objects belonged to two groups, alive or not alive. The participants' task was to indicate whether the two objects presented on a trial were congruent (i.e., both living or both nonliving) or incongruent (i.e., one living and one nonliving). Participants had to respond within 1.5 s after the onset of the trailing word. Feedback on behavioral performance, reporting accuracy and reaction time, was provided at the end of each run. The button mapping was counterbalanced across participants (odd/even subject IDs). There were seven runs in total, with each run consisting of 216 trials.

Figure 1.

Experimental paradigm. A, A trial of the learning session. Two object words were presented sequentially, without interstimulus interval (ISI), each lasting 500 ms. The first word probabilistically predicted the second word. Each trial ended with a 1–3 s ITI. B, A trial of the generalization session. Like the learning session, two objects were presented, but the trailing object word was replaced by a corresponding object image. Crucially, during this session the trailing object images were not predictable given the leading word. The leading word and trailing image were presented sequentially for 500 ms each, without ISI, followed by a 3–15 s ITI. C, The transitional probability matrix of the learning session, determining the associations between word pairs. L1 to L8 represent leading words and T1 to T8 represent trailing words. Green labels indicate that the word refers to a living object, whereas red indicates a nonliving object. Blue and brown cells denote expected and unexpected word pairs, respectively. The number inside each cell indicates the number of trials in the corresponding condition per run. D, The transitional probability matrix of the generalization session. The matrix was identical to the learning session except for three changes. First, T1–T8 represent trailing images instead of words. Moreover, a no-go condition was added in which the trailing images were of the same object as the leading words. Finally, the leading words were no longer predictive of the specific trailing stimulus; instead, one of two equiprobable trailing images was associated with each leading word, one of which was the previously expected object category. Thus, the blue cells represent object images corresponding to the (previously) expected words, whereas the brown cells represent object images that correspond to the unexpected words.

Expectations were induced by manipulating the transitional probabilities of the word pairs. The transitional probabilities and the object words themselves were determined by an 8 × 8 transitional probability matrix (Fig. 1C). For each leading word, the paired (expected) trailing word occurred on ∼74.1% (20/27) of trials, whereas any other trailing word (unexpected) appeared on ∼3.7% (1/27) of trials. The specific pairings of leading and trailing words were randomized for each participant. Eight expected word pairs were randomly selected for each subject and balanced with respect to button mappings; that is, half of the object pairs were of the same category (both alive or both not alive), and the other half were from different categories.
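The per-run trial counts implied by this design can be sketched as follows (a minimal sketch; the random pairing here is illustrative only — actual pairings were randomized per participant, and the counts follow the 20-vs-1 design described above):

```python
import numpy as np

def build_transition_counts(n=8, expected_count=20, unexpected_count=1, seed=0):
    """Per-run trial counts for each leading (row) x trailing (column)
    word pair: one expected trailing word per leading word (random
    pairing), all other cells unexpected."""
    rng = np.random.default_rng(seed)
    pairing = rng.permutation(n)            # expected trailing word per leading word
    counts = np.full((n, n), unexpected_count)
    counts[np.arange(n), pairing] = expected_count
    return counts, pairing

counts, pairing = build_transition_counts()
row_total = counts[0].sum()                 # 27 trials per leading word per run
p_expected = counts[0].max() / row_total    # 20/27, i.e. ~74.1%
```

With 8 leading words and 27 trials each, this reproduces the 216 trials per run stated above.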

On the next day, the participants performed an additional learning session to consolidate the learned associations. This session included two runs of the word–word task outside the MRI scanner and another 104 trials inside the MRI scanner, during which an anatomical image was acquired.

Generalization (MRI) session

During fMRI scanning, participants were exposed to word–image pairs. The leading words were the same as in the learning session, whereas the trailing words were replaced by corresponding object images. Each stimulus was presented for 500 ms followed by a variable ITI of 3–15 s (mean, 5 s), randomly sampled from a truncated exponential distribution. Participants indicated whether the trailing object was alive or not alive by button press. Additionally, we added a no-go condition, requiring participants to withhold their responses when the leading word and trailing image referred to the same object (e.g., the word dog followed by an image of a dog). The rationale for including these no-go trials was to ensure that participants also attended to the leading words. This was considered essential because the leading words served as a cue to generate the expectations of the trailing images, and previous studies suggest that unattended stimuli may not generate sensory predictions (Richter and de Lange, 2019). We also modified the transitional probabilities. Specifically, expected pairs remained the same, but one unexpected trailing image was selected for each leading word. Crucially, the expected and unexpected trailing images were now equiprobable, appearing equally often given a leading word. Therefore, the status of a trailing object image as expected or unexpected depended entirely on whether participants, spontaneously and without instruction, generalized the associations learned during the learning session to the word–image pairs. For each trailing object category, we used 10 distinct exemplars, with each exemplar being shown only once as expected and once as unexpected per run. Moreover, the exemplars varied in orientation, color, shape, and other low-level visual features to reduce the possibility that participants associated specific low-level visual features with the object category.
The selected expected and unexpected trailing images were always from different categories with respect to being (not) alive, such that participants could not predict the button response based on the leading words. Participants had to respond within 1.5 s after the onset of the trailing images. Feedback on accuracy and reaction time was provided at the end of each run. Participants first performed a brief practice run (12 trials, ∼2 min), which was followed by four runs of the main task. Each run (∼9 min) consisted of 88 trials, including 40 trials per expectation condition and 8 no-go trials. The presentation order was fully randomized.

Functional localizer

Finally, participants performed a functional localizer consisting of three types of stimuli: words, intact images, and globally phase-scrambled images. Participants were instructed to respond by button press whenever two consecutive images were identical while keeping their gaze at the central fixation point. The task consisted of three runs in a block design. Each run (∼8 min) included 16 word blocks, 16 intact image blocks, 8 phase-scrambled image blocks, and 4 null-event blocks. Each stimulus type was presented 15 times per block, each time flashing at 1.4 Hz (500 ms on, 200 ms off) for 10.5 s. The order of blocks (intact and phase-scrambled images) was fully randomized.

fMRI data acquisition

Functional and anatomical images were collected on a 3T Skyra MRI system (Siemens), using a 32-channel head coil. The functional images were acquired using a whole-brain T2*-weighted multiband four sequence (TR/TE = 1500/33.4 ms, 68 slices, voxel size 2 mm isotropic, FOV = 210 mm, 75° flip angle, anterior/posterior phase encoding direction, bandwidth = 2090 Hz/pixel). The anatomic images were acquired with a T1-weighted MP-RAGE (GRAPPA acceleration factor = 2, TR/TE = 2300/3.03 ms, voxel size 1 mm isotropic, 8° flip angle).

Questionnaire

After completing MRI scanning, participants filled in a short questionnaire assessing their explicit awareness of the word–word and word–image pairs. For example, the questionnaire asked participants, "In the first part of the experiment (word–word pairs), which object word(s) were most likely to appear after the word 'Dog'?" Similarly, for the word–image pairs they were asked, "In the second part of the experiment (word–image pairs), which object image(s) were most likely to appear after the word 'Dog'?" Participants gave their response by typing the name of the object(s) into a text box.

Data analysis

Behavioral data analysis

Behavioral data were analyzed in terms of accuracy and reaction time (RT). The expectation benefits were defined as higher accuracy (expected–unexpected) and faster RTs (unexpected–expected) for expected stimuli. Accuracy was calculated separately for expected and unexpected trials for each subject and contrasted with a paired t test. For the RT analysis, only correct responses were analyzed. Trials with RTs faster than 100 ms or slower than 1500 ms were rejected as outliers from further analysis. The remaining RTs were then averaged for each expectation condition separately per participant and subjected to a two-by-two repeated measures ANOVA (RM ANOVA) with expectation (expected and unexpected) and runs (runs 1, 2 and runs 3, 4) as factors. We then used paired t tests to assess possible differences between expected and unexpected conditions within early (runs 1, 2) and late runs (runs 3, 4), respectively, as well as for all runs combined. Partial eta squared (η2) and Cohen's dz (Lakens, 2013) were calculated as effect sizes for the RM ANOVA and paired t tests, respectively. Bayesian RM ANOVAs and t tests were used to evaluate any statistically nonsignificant results, assessing evidence for the absence of an effect. All Bayesian analyses were implemented in JASP 0.17 software using default settings (Cauchy prior width of 0.50 for RM ANOVA and 0.707 for t test). All SEMs were calculated as the within-subject normalized SEM (Cousineau, 2005). Because unexpected trailing word trials could require a change in the response during the learning session, we additionally assessed expectation effects without response adjustments. To this end we split the unexpected trailing word trials into two groups depending on whether they required the same button response as the expected trailing words.
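The RT outlier rejection, paired contrast, and within-subject SEM can be sketched as follows (a minimal sketch; the data structures and simulated values are ours, for illustration only):

```python
import numpy as np
from scipy import stats

def rt_expectation_effect(rt_expected, rt_unexpected, lo=0.1, hi=1.5):
    """Paired t test on per-subject mean RTs after outlier rejection.

    Inputs: lists of per-subject arrays of correct-trial RTs in
    seconds (hypothetical structure, for illustration)."""
    def subject_mean(rts):
        rts = rts[(rts > lo) & (rts < hi)]       # reject implausible RTs
        return rts.mean()

    exp = np.array([subject_mean(r) for r in rt_expected])
    unexp = np.array([subject_mean(r) for r in rt_unexpected])
    diff = exp - unexp
    t, p = stats.ttest_rel(exp, unexp)
    dz = diff.mean() / diff.std(ddof=1)          # Cohen's dz
    return t, p, dz

def within_subject_sem(data):
    """Cousineau (2005) within-subject SEM: remove per-subject offsets
    before computing the between-subject SEM per condition."""
    data = np.asarray(data)                      # subjects x conditions
    centered = data - data.mean(axis=1, keepdims=True) + data.mean()
    return centered.std(axis=0, ddof=1) / np.sqrt(data.shape[0])
```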

fMRI data preprocessing

fMRI data preprocessing was performed using Functional MRI of the Brain (FMRIB) Software Library (FSL) 5.0.9 (www.fmrib.ox.ac.uk/fsl; RRID:SCR_002823). The preprocessing pipeline included brain extraction (Brain Extraction Tool), motion correction [FMRIB Linear Image Registration Tool with motion correction (MCFLIRT)], and temporal high-pass filtering (128 s). For the univariate analysis, the data were spatially smoothed with a Gaussian kernel (5 mm FWHM). For the multivariate analysis, no spatial smoothing was applied. Functional images were registered to the anatomical image using boundary-based registration as implemented in FLIRT and subsequently normalized to the MNI-152 T1 2 mm template brain using linear registration with 12 degrees of freedom. To allow for signal stabilization, the first four volumes of each run were discarded.

Univariate data analysis

We modeled BOLD signal responses to the different experimental conditions by fitting voxelwise GLMs to the data of each run, using the FSL fMRI Expert Analysis Tool (FEAT). We modeled the events of interest (expected, unexpected, and no-go) as three separate regressors with a duration of 1 s (the combined duration of leading word and trailing image) and convolved them with a double-gamma hemodynamic response function. The contrast of interest was expected minus unexpected images, such that negative parameter estimates indicate expectation suppression. In addition, the first-order temporal derivatives of the regressors of interest and 24 motion regressors (FSL standard plus extended set of motion parameters) were added as nuisance regressors. FSL FEAT was used to combine data across runs, whereas FSL FLAME 1 (FMRIB Local Analysis of Mixed Effects) was used to combine data across participants. Gaussian random-field cluster thresholding was used to correct for multiple comparisons, using the default settings of FSL, with a cluster formation threshold of p < 0.001 (z ≥ 3.1) and a cluster significance threshold of p < 0.05.
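The construction of such an event regressor can be sketched as follows (a minimal sketch; the double-gamma parameters are the common SPM-style defaults, which FSL's double-gamma HRF closely resembles, not values reported by the authors):

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(dt, duration=32.0):
    """Canonical double-gamma HRF sampled every dt seconds (peak ~5 s,
    undershoot ~15 s, 1:6 ratio -- standard values, not from the paper)."""
    t = np.arange(0, duration, dt)
    hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return hrf / hrf.sum()

def event_regressor(onsets, event_dur, n_scans, tr=1.5, dt=0.1):
    """Boxcar for events of `event_dur` seconds (here, the 1 s leading
    word + trailing image), convolved with the HRF, sampled at scans."""
    grid = np.zeros(int(n_scans * tr / dt))      # fine temporal grid
    for onset in onsets:
        i0 = int(onset / dt)
        grid[i0:i0 + int(event_dur / dt)] = 1.0
    conv = np.convolve(grid, double_gamma_hrf(dt))[:len(grid)]
    scan_idx = (np.arange(n_scans) * tr / dt).astype(int)
    return conv[scan_idx]                        # one value per TR

# One 1 s event at t = 10 s, sampled over 40 scans (TR = 1.5 s)
reg = event_regressor(onsets=[10.0], event_dur=1.0, n_scans=40)
```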

Region of interest analysis

To examine expectation suppression in the ventral visual stream, we selected three regions of interest (ROIs) at different levels of the visual hierarchy—EVC, lateral occipital cortex (LOC), and ventral temporal cortex (VTC). For each ROI, the mean parameter estimates were extracted separately for expected and unexpected conditions per participant in native space (i.e., without normalization to MNI space). The parameter estimates were then divided by 100 to yield an approximation of percentage signal change relative to baseline (https://jeanettemumford.org/assets/files/perchange_guide.pdf). These mean parameter estimates were submitted to paired t tests for each ROI. We used the Benjamini and Hochberg (1995) procedure to control the false discovery rate (FDR) at 0.05. Reported p values were adjusted for multiple ROIs using FDR correction [pcorrected = puncorrected * (number of tests)/(rank of puncorrected)]. Cohen's dz was calculated as a measure of effect size. Additionally, similar to the RT analysis, we performed a two-by-two RM ANOVA with expectation (expected and unexpected) and runs (runs 1, 2 and runs 3, 4) as factors for each ROI to assess any differences in expectation suppression magnitudes between the first half (runs 1, 2) and second half (runs 3, 4) of the MRI session. Differences in expectation suppression between the two halves may indicate extinction of the previously learned associations. Partial eta squared (η2) was calculated to express effect sizes for the ANOVA. Moreover, Bayesian RM ANOVAs were conducted to assess evidence for the absence of an effect.
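The reported adjustment corresponds to the standard Benjamini–Hochberg procedure, which can be sketched as follows (a minimal illustration including the monotonicity step that the bracketed formula leaves implicit):

```python
import numpy as np

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p values:
    p_adj = p * n_tests / rank, then enforce monotonicity and cap at 1."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Running minimum from the largest p value down keeps the adjusted
    # values monotonic in the original p values
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adj = np.empty(n)
    adj[order] = np.minimum(ranked, 1.0)
    return adj
```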

ROI definition

To examine the expectation effects throughout the visual hierarchy, we defined three ROIs, EVC, object-selective LOC, and category-selective VTC. The preregistered EVC and LOC ROIs were used to investigate modulations by expectation at the low-level and intermediate object-selective visual regions. In addition to the ROIs from the preregistration, we added VTC to further examine potential activity modulations by expectation in higher category-selective regions that are sensitive to the distinction between living and nonliving objects (Thorat et al., 2019). The category-selective ROI definition in the preregistration document specified a split into four categories (faces, body parts, buildings, tools). However, during analysis it became apparent that splitting the data into four categories resulted in too few trials per category and hence an underpowered analysis. We therefore chose to adjust our ROIs to form a combined VTC mask. Results using the four category-specific masks are included in an openly accessible data repository (https://doi.org/10.34973/3nkk-gn64).

EVC was defined as the union of V1 and V2. FreeSurfer 6.0 (RRID:SCR_001847) software was used to extract labels for V1 and V2 per subject based on their anatomical image. These labels were then transformed to native EPI space and combined into a bilateral mask. Object-selective LOC was defined as voxels within an anatomical LOC mask, derived from the Harvard–Oxford cortical atlas, that were more responsive to intact compared with phase-scrambled objects. To this end, we modeled the three types of stimuli during the fMRI localizer runs (words and intact and phase-scrambled objects) with their corresponding duration (10.5 s). First-order temporal derivatives and 24 motion regressors were added as nuisance regressors. The contrast of the intact objects minus scrambled objects was thresholded at z ≥ 4.3 (one sided, p < 1e-5) and further constrained by the anatomical LOC mask. The resulting LOC masks all contained at least 600 voxels in native space per participant. The VTC ROI mask was created using anatomical masks from the Harvard–Oxford cortical atlas, including the temporal-occipital fusiform cortex, the temporal gyrus, and the parahippocampal gyrus. The resulting mask was transformed from MNI space to the participants' native space using FSL FLIRT. Finally, we constrained each of the ROI masks to the most informative voxels regarding object image category. Specifically, we decoded object category during the functional localizer run per participant (see below, Multivoxel pattern analysis) and then for each ROI and subject selected the 300 voxels forming the most informative neighborhood in the decoding analysis. As a robustness check for all ROI analyses, we repeated each analysis with mask sizes ranging from 50 to 600 voxels in steps of 50 voxels.
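The final voxel-selection step can be sketched as follows (a minimal sketch with hypothetical inputs — a 3-D searchlight accuracy map and a boolean ROI mask; as a simplification it takes the top-n voxels within the ROI, rather than reconstructing the authors' "most informative neighborhood" selection exactly):

```python
import numpy as np

def top_n_voxels(accuracy_map, roi_mask, n=300):
    """Restrict an ROI mask to the n voxels with the highest decoding
    accuracy (simplified stand-in for the neighborhood selection)."""
    n = min(n, int(roi_mask.sum()))
    acc = np.where(roi_mask, accuracy_map, -np.inf)  # exclude non-ROI voxels
    flat = np.argsort(acc, axis=None)[::-1][:n]
    selected = np.zeros(acc.shape, dtype=bool)
    selected[np.unravel_index(flat, acc.shape)] = True
    return selected

# Toy example: random accuracies and a 1000-voxel cubic ROI
rng = np.random.default_rng(0)
acc_map = rng.uniform(0.2, 0.6, (20, 20, 20))    # chance is 0.25 for 4 classes
roi = np.zeros((20, 20, 20), dtype=bool)
roi[5:15, 5:15, 5:15] = True
mask300 = top_n_voxels(acc_map, roi, n=300)
```

Rerunning the same selection with n from 50 to 600 reproduces the robustness check described above.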

Expectation suppression selectivity analysis

In addition to the main expectation analysis, we also performed an analysis assessing the selectivity of the expectation modulation. First, a GLM-based analysis was performed using the data from the generalization session. However, now we modeled the expected and unexpected events for each aliveness category separately, thus resulting in four regressors of interest (expected alive, unexpected alive, expected not alive, and unexpected not alive). Additionally, we also modeled the no-go condition and the nuisance regressors (same as for the other analyses). Next, for the three anatomical masks (EVC, LOC, and VTC) we selected voxels that preferentially responded to images of living or nonliving objects. Preference was estimated using the independent localizer data, for which we modeled the image stimuli separately for living and nonliving objects. The contrast living–nonliving was then used to identify the 300 voxels that were more responsive (highest z scores) to living compared with nonliving objects. The inverse contrast (nonliving–living) was used to identify the 300 voxels that were more responsive to images of nonliving objects. The ROI masks, split into living preferred and nonliving preferred, were then applied to the main task data to obtain the mean parameter estimates for expected and unexpected trailing images per aliveness category. The mean parameter estimates were then averaged by stimulus preference; that is, responses to living stimuli in living-preferring voxels were combined with responses to nonliving stimuli in nonliving-preferring voxels, and conversely for the nonpreferred condition. Thus, this analysis yields four data points per participant and ROI (EVC, LOC, VTC): expected preferred stimuli, unexpected preferred stimuli, expected nonpreferred stimuli, and unexpected nonpreferred stimuli.
The BOLD response difference between expected and unexpected trials for each preference category was examined using paired t tests. Reported p values were corrected for multiple comparisons (three ROIs and two preference conditions) using FDR correction. Partial eta squared (η2) and Cohen's dz were calculated as effect sizes for the RM ANOVAs and paired t tests, respectively. Bayesian RM ANOVAs and t tests were used to assess evidence for the absence of an effect.
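The paired comparisons and FDR correction can be sketched as follows; a minimal illustration assuming per-participant parameter estimates stored as arrays, with the Benjamini and Hochberg (1995) step-up adjustment written out inline to keep the example dependency-light.

```python
import numpy as np
from scipy import stats

def paired_tests_fdr(expected, unexpected):
    """Paired t-tests with Benjamini-Hochberg FDR-adjusted p values and
    Cohen's dz. Inputs: dicts mapping a condition label to an array of
    per-participant parameter estimates (hypothetical data layout)."""
    labels = list(expected)
    t_vals, p_vals, dz_vals = [], [], []
    for k in labels:
        diff = expected[k] - unexpected[k]
        t, p = stats.ttest_rel(expected[k], unexpected[k])
        t_vals.append(t)
        p_vals.append(p)
        dz_vals.append(diff.mean() / diff.std(ddof=1))  # Cohen's dz
    # Benjamini-Hochberg step-up adjustment of the p values
    p = np.asarray(p_vals)
    m = len(p)
    order = np.argsort(p)
    q = p[order] * m / np.arange(1, m + 1)
    adj = np.empty(m)
    adj[order] = np.minimum.accumulate(q[::-1])[::-1]
    return dict(zip(labels, zip(t_vals, np.clip(adj, 0, 1), dz_vals)))

# Toy data: 20 "participants", one condition with a true effect
rng = np.random.default_rng(1)
expected = {"pref": rng.normal(1.0, 1.0, 20), "nonpref": rng.normal(0.0, 1.0, 20)}
unexpected = {"pref": rng.normal(0.0, 1.0, 20), "nonpref": rng.normal(0.0, 1.0, 20)}
results = paired_tests_fdr(expected, unexpected)
```

In practice a library routine (e.g., statsmodels' `multipletests` with `method="fdr_bh"`) would typically replace the inline adjustment.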

Multivoxel pattern analysis

For the multivoxel pattern analyses, no spatial smoothing was applied. Parameter estimate maps per localizer trial were estimated using a GLM-based Least Squares Separate (LSS) approach (Mumford et al., 2012). Each model contained four regressors of interest, one for the trial of interest and three for all other trials per condition (expected, unexpected, and no-go). The resulting parameter estimate maps were subsequently used in a searchlight analysis with a sphere of a 6 mm radius. A leave-one-run-out classifier, using a linear support vector machine, was fit to the four object classes (faces, body parts, buildings, and tools). The classifier was trained and tested on the independent localizer data across the whole brain for each participant in native space. The resulting decoding accuracy maps were then used to constrain the ROI masks per participant to select the most informative voxels. Note that decoding object category in this manner does not imply that voxel neighborhoods with above chance decoding accuracy necessarily represent abstract concepts such as category membership. Instead, the classifier will exploit any systematic differences in visual and other nonvisual characteristics between the categories, including commonalities in low-level features, such as higher correlations between object shapes within categories.

Software

Psychtoolbox (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007) software running on MATLAB R2017b (MathWorks; RRID:SCR_001622) was used for stimulus presentation. fMRI data preprocessing and analysis were performed using FSL 5.0.9 (RRID:SCR_002823) and FreeSurfer 6.0 (RRID:SCR_001847). Python 3.7.4 (RRID:SCR_008394) was used for data processing with the following libraries: NumPy 1.17.2 (van der Walt et al., 2011), Pandas 0.25.1 (https://pandas.pydata.org/), nilearn 0.8.1 (Abraham et al., 2014), Scikit-learn 0.18.1 (RRID:SCR_002577). Matplotlib 3.1.1 (Hunter, 2007) and Nanslice (Wood, 2017/2020) were used for data visualization (https://github.com/spinicist/nanslice). JASP 0.17 (https://jasp-stats.org; RRID:SCR_015823) was used for statistical tests, including RM ANOVAs, paired t tests, and Bayesian analyses.

Data availability

All data and code are openly available at the Donders Institute for Brain, Cognition, and Behavior repository at https://doi.org/10.34973/3nkk-gn64.

Results

Word associations facilitate categorization performance of associated object images

During the first session, participants were exposed to predictable object word pairs. In these pairs, the first word probabilistically predicted the identity of the second word, allowing participants to anticipate the trailing word based on the leading word. Data from this learning session (word pairs) were analyzed in terms of accuracy and reaction time (RT). Unexpected trailing words could require the same or a different response compared with expected words. Thus, we analyzed unexpected words requiring the same or different responses separately, as the latter required additional response adjustments. Our results (Fig. 2A) demonstrated that participants categorized the trailing words in the expected condition more accurately (96.2 ± 0.7% vs 91.1 ± 0.7%, mean ± SE; t(34) = 7.798, p = 4.5e-9, dz = 1.337) and faster (654 ± 5.7 ms vs 748 ± 4.7 ms; t(34) = 12.426, p = 3.4e-14, dz = 2.131) than the trailing words in the unexpected condition that required the same response. In addition, participants also categorized the unexpected trailing words more accurately (t(34) = 2.735, p = 0.009, dz = 0.469) and faster (t(34) = 2.845, p = 0.007, dz = 0.488) when the trailing words required the same response compared with the different response. Figure 2B depicts the behavioral results of the word pair learning session per block (each 216 trials). A reliable effect of prediction (all p values < 0.014, all dz values > 0.568, FDR corrected) was already apparent during the first block, indicating rapid learning of the word–word pairs. These results suggest that participants learned the associations between the word pairs and used them to predict the identity of the trailing words.

Figure 2.

Word–word associations aid in the classification of category congruency for word–word pairs. A, Behavioral benefits of prediction (expectation) for the word pairs indicates the learning of word associations during the learning session. Responses to expected trailing words were significantly more accurate (top) and faster (bottom) compared with unexpected words that required the same (Unexpected-S) or different response (Unexpected-D). B, Development of behavioral benefits of prediction during the learning session. Responses to expected trailing words were more accurate (top) and faster (bottom) compared with unexpected words that required the same (Unexpected-S) or different response (Unexpected-D) across all learning blocks, demonstrating rapid learning. Error bars indicate within-subject SE.
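The within-subject SEs shown in the error bars can be obtained by removing between-subject offsets before computing the SE (Cousineau, 2005). A minimal sketch with hypothetical variable names, omitting the Morey correction factor:

```python
import numpy as np

def within_subject_se(data):
    """Within-subject SE per condition (Cousineau, 2005): center each
    subject's scores on the grand mean before computing the SE.
    data: (n_subjects, n_conditions) array (hypothetical layout)."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    # Remove between-subject offsets: subtract each subject's mean,
    # then add back the grand mean
    normed = data - data.mean(axis=1, keepdims=True) + data.mean()
    return normed.std(axis=0, ddof=1) / np.sqrt(n)

# Toy example: 4 subjects x 2 conditions with large between-subject offsets
rts = np.array([[650., 660.], [700., 712.], [600., 609.], [720., 731.]])
print(within_subject_se(rts))  # much smaller than the between-subject SE
```

Because the condition effect is consistent within subjects here, the within-subject SE is far smaller than the conventional between-subject SE would suggest.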

During the second session, participants were exposed to word–image pairs. For each word pair, the trailing object word was replaced by an image of the corresponding object (e.g., the word car followed by a picture of a dog). Additionally, the leading words were no longer predictive of the trailing objects. We analyzed the behavioral data from this generalization (MRI) session to assess the spontaneous generalization of word pairs from the learning session to the word–image pairs. Overall, participants categorized the trailing images with high accuracy in both the expected (97.3 ± 0.6%, mean ± SE) and the unexpected conditions (97.4 ± 0.7%, mean ± SE). Additionally, participants also correctly rejected the no-go trials (93.0 ± 1.2%, mean ± SE), indicating good task compliance. Figure 3A shows that response accuracy for expected and unexpected trailing images did not differ significantly from each other (t(34) = 0.424, p = 0.675, dz = 0.073). Given the high accuracy in both conditions (>97%), this null result may reflect a ceiling effect. However, RTs for expected object images were faster than for unexpected objects (644.5 ± 1.7 ms vs 651.7 ± 1.7 ms, mean ± SE; t(34) = 2.968, p = 0.005, dz = 0.509). Crucially, although we refer to expected and unexpected object images here, the trailing objects were in fact only (un-)expected by virtue of the associations learned during the learning session, as the trailing object images were equiprobable during the generalization session. Thus, participants appeared to have generalized the associations from the word–word pairs to speed up the categorization of the corresponding expected trailing object images.

Figure 3.

Word–word associations facilitate behavioral responses to corresponding word–image pairs. A, Behavioral performance for the category classification task during the generalization (MRI) session. Leading words were not predictive of the trailing images in the generalization session. Therefore, any behavioral benefits of prediction must have been derived from the word–word associations learned during the learning session. Responses were highly accurate (left) and did not differ between expectation conditions. RTs (right) were significantly faster to expected trailing object images compared with unexpected object images, indicating the generalization of associations from the word–word pairs to the word–image pairs. B, RTs to expected (blue) and unexpected (brown) object images for early runs (Run 1+2) and late runs (Run 3+4). Error bars indicate within-subject SE; **p < 0.01.

Finally, we assessed whether the behavioral prediction effect extinguished over time (runs), given that there were no predictive relationships during the second (MRI) session. Results of a two-by-two repeated-measures ANOVA, shown in Figure 3B, yielded an overall decrease of RTs over time, indicating that participants improved in task performance (main effect of runs, early vs late runs, F(1,34) = 73.949, p = 4.8e-10, η2 = 0.685). Moreover, responses were faster to expected stimuli (main effect of expectation, F(1,34) = 9.452, p = 0.004, η2 = 0.218), showing the response speeding because of prediction. However, despite a visible trend, suggesting that RT benefits may have decreased over time, we failed to detect a reliable interaction between runs and expectation (F(1,34) = 3.724, p = 0.062, η2 = 0.099). Results from a Bayesian RM ANOVA were inconclusive, providing only anecdotal support for an interaction (BF10 = 1.473). Thus, although the RT benefit because of prediction was numerically smaller during the second half of the MRI session (RT benefit, first half = 11.3 ± 3.3 ms vs second half = 3.0 ± 3.0 ms) this difference was not statistically reliable.

Following MRI scanning, participants filled in a brief questionnaire, assessing their explicit awareness of the object pairs. For the word–word pairs, participants correctly identified 6.6 ± 1.7 (mean ± SD; i.e., 82.5%; chance level 12.5%) of the eight pairs, demonstrating explicit knowledge of the word–word associations. For the word–image pairs, participants indicated the previously expected object for 2.8 ± 2.8 (mean ± SD; 35%) of the eight pairs. Evaluating these results is challenging, because two of the eight trailing images were shown equally often following each leading word. Nonetheless, the results show that participants were aware of the word–word associations but may have unlearned, or only partially explicitly generalized, the associations to the word–image pairs during MRI scanning. However, as the assessment of explicit awareness of the word–image pairs was performed after MRI scanning, the present study cannot differentiate well between implicit and explicit contributions to the prediction effect observed here.

Conceptual predictions modulate sensory processing

Having established the generalization of associations from word–word pairs to word–image pairs in terms of a behavioral facilitation of reaction times to previously expected objects, we next turned to investigate whether neural responses to the object images were modulated by the previously learned word pair associations. To this end, we first performed an ROI analysis targeting three distinct levels of the visual processing hierarchy, EVC, LOC, and VTC (Fig. 4A). These three ROIs were defined to investigate possible prediction-induced activity modulations in low-level visual areas (EVC), intermediate object-selective regions (LOC), and higher category-selective visual cortex (VTC; see above, Materials and Methods, ROI definition). Our results, depicted in Figure 4B, demonstrated suppressed BOLD responses for expected compared with unexpected object images in EVC (t(33) = 3.628, p = 0.003, dz = 0.622), object-selective LOC (t(33) = 2.782, p = 0.009, dz = 0.477) and VTC [t(33) = 2.851, p = 0.009, dz = 0.489; reported p values were corrected for multiple comparisons (three ROIs) using FDR correction]. To verify that our results did not depend on the specific ROI mask size, we successfully replicated all ROI results with mask sizes ranging from 50 to 600 voxels in steps of 50. Complementary to the ROI analysis, we performed a whole-brain analysis (see above, Materials and Methods, Univariate data analysis). Figure 4C shows that expected object images resulted in attenuated brain activity throughout the ventral visual stream compared with unexpected objects. This reduction of neural activity by predictions, also known as expectation suppression, was primarily evident in the ventral visual stream, including parts of lingual and fusiform gyrus, calcarine sulcus, and cuneus. 
Together, these results suggest that the word pair associations resulted in a suppression of sensory responses to the corresponding expected object images across major parts of the ventral visual stream, including EVC.

Figure 4.

Expectation suppression across the ventral visual stream. A, Three anatomical masks in the ventral visual pathway, EVC (top), object-selective LOC (middle), and VTC (bottom). These anatomical masks were further constrained per participant using independent localizer data (see above, Materials and Methods, ROI definition). B, Averaged BOLD responses (parameter estimates) to expected (blue) and unexpected (brown) object images within EVC, LOC, and VTC. In all three ROIs, BOLD responses were significantly suppressed to the expected compared with unexpected object images. Error bars indicate within-subject SE; **p < 0.01; p values were adjusted for three comparisons (ROIs) using FDR correction. C, Expectation suppression revealed by whole-brain analysis. Color represents the parameter estimates for the contrast expected minus unexpected, displayed on the MNI-152 template brain. Blue clusters represent decreased activity for expected compared with unexpected object images. Opacity indicates the z statistics of the contrast. Black contours outline statistically significant clusters (Gaussian random field cluster corrected). Significant clusters were observed in the ventral visual stream, including EVC, LOC, and VTC.

In addition, we investigated whether the neural prediction effect (expectation suppression) differed between the first (runs 1, 2) and second half (runs 3, 4) of the MRI session, which would indicate extinction of the previously learned word–word associations, given the nonpredictive word–image associations during MRI scanning. To this end, three two-by-two repeated-measures ANOVAs were performed, one per ROI. As shown in Figure 5, BOLD responses decreased over time, confirmed by significant main effects of run in each of the ROIs (EVC, F(1,33) = 4.673, p = 0.038, η2 = 0.124; LOC, F(1,33) = 12.374, p = 0.002, η2 = 0.273; VTC, F(1,33) = 20.205, p = 2.4e-4, η2 = 0.380). Additionally, we also observed main effects of expectation in all ROIs (EVC, F(1,33) = 15.249, p = 0.001, η2 = 0.316; LOC, F(1,33) = 9.812, p = 0.004, η2 = 0.229; VTC, F(1,33) = 9.691, p = 0.004, η2 = 0.227), echoing the suppressed responses to expected compared with unexpected stimuli seen in Figure 4B. However, there was no significant interaction between expectation and run (EVC, F(1,33) = 0.429, p = 0.995, η2 = 0.013; LOC, F(1,33) = 0.005, p = 0.995, η2 = 1.6e-4; VTC, F(1,33) = 4.2e-5, p = 0.995, η2 = 1.0e-5), suggesting that expectation suppression magnitudes did not differ between the first and second half of the MRI session. This was confirmed by Bayesian RM ANOVAs providing moderate support for the absence of an interaction (EVC, BF10 = 0.316; LOC, BF10 = 0.232; VTC, BF10 = 0.264). Thus, the extinction of previously acquired predictions may take significant exposure to the new nonpredictive associations.

Figure 5.

Expectation suppression during early and late runs. Averaged BOLD responses to expected (blue) and unexpected (brown) object images for early runs (Run 1+2) and late runs (Run 3+4) within EVC (left), LOC (middle), and VTC (right). Across all three ROIs expectation suppression did not significantly extinguish over time (runs). Error bars indicate within-subject SE; *p < 0.05, **p < 0.01, ***p < 0.001 (FDR corrected).

Category-specific expectation suppression from conceptual associations

Next, we investigated whether the observed neural suppression in sensory cortex was category specific, and thus dependent on neural tuning, or reflected a general and unspecific upregulation of neural responses to surprising (unexpected) input, regardless of tuning. To this end, we first generated stimulus preference ROIs by splitting voxels (neural populations) within each anatomical mask into two populations depending on whether they preferentially responded to images of living (e.g., faces and body parts) or nonliving (e.g., houses and tools) objects (see above, Materials and Methods, Expectation suppression selectivity analysis). BOLD responses to expected and unexpected objects were extracted within the resulting ROIs for each superordinate category (living vs nonliving) and averaged separately depending on whether the object image was of the preferred or nonpreferred superordinate category. Our results, depicted in Figure 6, showed that expectation suppression was robustly present in all three visual ROIs when the category of the trailing images was preferred (EVC, t(33) = 2.913, p = 0.013, dz = 0.500; LOC, t(33) = 4.062, p = 8.5e-4, dz = 0.697; VTC, t(33) = 4.710, p = 2.6e-4, dz = 0.808). However, there was no reliable suppression of neural responses when the trailing images were not preferred (EVC, t(33) = −0.988, p = 0.360, dz = −0.169; LOC, t(33) = 0.928, p = 0.360, dz = 0.159; VTC, t(33) = 0.992, p = 0.360, dz = 0.170). This pattern of results was also confirmed by significant interaction effects (EVC, F(1,33) = 7.507, p = 0.015, η2 = 0.185; LOC, F(1,33) = 5.582, p = 0.024, η2 = 0.145; VTC, F(1,33) = 7.507, p = 0.015, η2 = 0.186) between expectation (expected, unexpected) and preference (preferred, nonpreferred). Furthermore, Bayesian t tests showed moderate support for the absence of expectation suppression for nonpreferred stimuli (EVC, BF10 = 0.288; LOC, BF10 = 0.273; VTC, BF10 = 0.289). 
Crucially, neural populations showed reliable activations also to the nonpreferred stimuli in all ROIs (all p values < 3.0e-10, all dz values > 1.6); hence, the absence of expectation suppression for nonpreferred stimuli could not be explained by a lack of visual responsiveness. In sum, our results showed that expectations derived from conceptual associations selectively suppressed responses to preferred stimulus categories, highlighting that sensory predictions were category specific.

Figure 6.

Expectation suppression only for preferred object stimuli. BOLD responses to expected (blue) and unexpected (brown) object images for preferred and nonpreferred stimuli within EVC (left), LOC (middle), and VTC (right). In all three ROIs, BOLD responses were suppressed to expected object images exclusively when the object category was preferred. BOLD responses did not differ between expected and unexpected images for nonpreferred object images. Error bars indicate within-subject SE; *p < 0.05, ***p < 0.001 (FDR corrected); BF10 < 1/3 denotes moderate evidence for the absence of a difference.

Discussion

Sensory priors, derived from perceptual associations, play a crucial role in modulating visual processing (de Lange et al., 2018). However, less is known about how conceptual associations influence vision. Here, we examined whether conceptual associations serve as sensory priors and how visual processing of object images is modulated by such priors. We first exposed participants to probabilistically associated word–word pairs. Participants were not informed of the underlying statistical regularities, nor required to learn them. However, the regularities could be used to facilitate performance. On the subsequent day, the trailing (second) words were replaced with pictures of the corresponding objects, resulting in word–image pairs. We observed faster classifications of object images corresponding to previously expected trailing words compared with unexpected ones. Crucially, during fMRI scanning, object images were equiprobable, hence not allowing for any statistical learning of the word–image pairs. Thus, participants must have generalized the previously learned word–word associations to the conceptually matching word–image pairs. Our results showed suppressed BOLD responses to object images corresponding to the previously expected trailing words compared with unexpected ones throughout the ventral visual stream, including early visual cortex. This prediction-induced suppression of sensory responses was category specific, interacting with the tuning of the neural populations. Specifically, expectation suppression was exclusively found for preferred but not for nonpreferred superordinate object categories, ruling out unspecific global surprisal as the source of the observed effect. Together, our results show how priors learned in a different domain, here linguistically, can be generalized via conceptual associations to subsequent sensory processing.

Prediction errors in (early) visual cortex from conceptual category priors

Participants learned word–word pairs, hence, any predictions during exposure to the word–image pairs must have originated at the level of category expectations, such as predicting the category dog given a leading word. Dogs can have different shapes and sizes and can assume different positions, or more generally speaking, low-level visual features can vary dramatically yet still refer to the same object. Moreover, numerous exemplars per category were presented without exemplar repetitions within a run. Thus, it seems unlikely that category priors resulted in a predictive preactivation of specific low-level features that are relevant for EVC representations, such as oriented edges in retinotopically specific locations (Hubel and Wiesel, 1962). Yet, we did observe tuning-specific expectation suppression in EVC. How can we account for this observation?

It is possible that category expectations initially resulted in predictions of semantic and high-level visual representations of the associated object categories. That is, dogs share high-level (visual) features, such as being animate and having four legs and a nose. Thus, expected stimuli may result in facilitated neural processing initially by virtue of those shared features. As predictions spread throughout the visual hierarchy (Schwiedrzik and Freiwald, 2017), EVC may subsequently be modulated by recurrent hierarchical processing. That is, if processing in higher (visual) areas is facilitated by the categorical predictions, then lower visual areas will also converge faster, or more efficiently, on a valid interpretation of the current visual stimulus because of more reliable and abundant feedback signals from higher visual areas. This explanation fits well within a hierarchical predictive coding framework (Friston, 2005; Rao and Ballard, 1999; for review, see Walsh et al., 2020). Crucially, in the context of predictive processing theories, predictions are not necessarily about future stimuli but are relayed top down at each level of the cortical hierarchy, aimed at predicting bottom-up input. Hence, the present results may represent a faster and more efficient resolution of prediction errors throughout the visual system because of valid predictions. Specifically, conceptual and high-level visual feature predictions aid in resolving prediction errors in higher (visual) cortical areas, which through recurrent message passing across the visual hierarchy in turn also reduce prediction errors (i.e., explain away activity) in early visual cortex.

It is important to note that the present results cannot be explained by neural adaptation or repetition suppression/priming (Kourtzi and Kanwisher, 2001; Henson and Rugg, 2003; Kristjánsson et al., 2007; Brinkhuis et al., 2020). Although repetition suppression is a closely related phenomenon, it can be differentiated from expectation suppression both conceptually and experimentally (Kaliukhovich and Vogels, 2011; Todorovic and de Lange, 2012; Summerfield and de Lange, 2014). Here, we introduced expectations about upcoming stimuli, but the repetition frequency of stimuli did not differ between the expectation conditions.

Conceptual category priors dampen sensory representations

Hierarchical predictive coding accounts also suggest that prediction errors are tuning specific, that is, representing features relevant to the different levels of the visual hierarchy (Friston, 2005; Clark, 2013; Walsh et al., 2020). An alternative account may hold that expectation suppression represents unspecific modulations, such as generic surprisal. This surprise signal, resulting in increases in arousal or attention, could then yield a proportional increase of sensory responses to the unexpected stimuli without any (feature specific) prediction error in sensory cortex (Alink and Blank, 2021). Our observation that expectation suppression was exclusively present for preferred superordinate stimulus categories contradicts this latter account because an unspecific surprise signal should also be evident for nonpreferred stimuli (proportional to the sensory response), which was not observed. Instead, selective suppression for preferred stimulus categories suggests that expectations derived from conceptual associations dampen sensory representations of expected input, thus resembling previous reports of sensory dampening following visual statistical learning (Meyer and Olson, 2011; Kumar et al., 2017; Richter et al., 2022).

Expectation suppression and dampening of neural representations may serve multiple functions. Suppressing activity of predictable input may preserve processing and attentional resources, as well as reduce redundancy in sensory cortex, thus reflecting more efficient perceptual processing. Moreover, by dampening representations of predicted input, surprising, potentially important stimuli automatically attract attention and become prioritized. Note, however, that this perceptual prioritization does not imply that behavioral responses to surprising inputs are faster compared with expected inputs, because additional processes (e.g., postperceptual response preparation) strongly modulate behavioral responses. Rather, dampening facilitates processing of surprising inputs compared with processing the same surprising stimuli without a relative dampening of expected inputs.

Sensory priors from generalized associations following incidental statistical learning

In line with the present results, prior work showed that the expectation of higher-level attributes of complex stimuli (e.g., category expectations) can facilitate perceptual processing (Lupyan and Ward, 2013; Stein and Peelen, 2015), hence suggesting that conceptual knowledge can serve as a prior for visual processing. However, a crucial difference between previous work and the present study is that we assessed the learning, generalization, and subsequent sensory consequences of novel (arbitrary) associations following incidental statistical learning. That is, here we do not rely on well-established congruency effects, such as the word dog predicting the image of a dog (Puri et al., 2009; Gandolfo and Downing, 2019). Rather, we show how statistical regularities, acquired incidentally in one domain (the word car predicting the word dog) and residing at a conceptual level abstracted away from the lexical items, can subsequently serve as a sensory prior facilitating processing of the visual features associated with the predicted image (a picture of a dog following the word car). Interestingly, we observed this generalization, although prior work has demonstrated that statistical learning preferentially operates at less abstract (object exemplar) levels (Emberson and Rubinstein, 2016). Moreover, priors derived from the word pairs were applied to the word–image pairs, although the new statistical environment did not allow for reliable predictions, demonstrating the propensity of the brain to learn and use sensory priors. Thus, high-level conceptual associations appear to be readily used to construct concrete sensory predictions, thereby using knowledge of relationships between stimuli abstracted beyond the domain in which they were initially learned.

Interpretational limitations

Words referring to objects can modulate neural activity in category-selective visual areas (Kan et al., 2003; Kiefer, 2005). These results raise the possibility that the prediction effect in the present study was induced by the leading words triggering the associated trailing words, which in turn may have automatically activated corresponding category-selective visual areas. An unexpected object image would then lead to activation in addition to the anticipated stimulus, superficially appearing as a prediction error. Although this account cannot be ruled out conclusively, our data speak against it. We did not observe expectation suppression for nonpreferred categories, although the neural populations were reliably driven by the nonpreferred stimuli. If leading words automatically recall trailing words, which in turn lead to the activation of category-selective visual areas, we would have expected to see enhanced responses to nonpreferred unexpected stimuli as well. Instead, conceptual predictions appear to selectively modulate sensory processing.

Conclusion

In sum, our results demonstrate a widespread modulation of visual processing by conceptual associations. Crucially, these associations were based on priors generalized across domains from word–word pairs to word–image pairs, following incidental statistical learning. Thus, the sensory brain appears to spontaneously use various sources for forming and applying predictions, including high-level priors based on recently acquired conceptual associations. Such predictions subsequently modulate neural processing in a category-specific fashion throughout the ventral visual hierarchy, including early visual cortex, resulting in a marked suppression of sensory responses to expected compared with surprising visual input.

Footnotes

  • This work was supported by the European Commission Horizon 2020 Program European Research Council Starting Grant 678286 to F.P.d.L. and China Scholarship Council Grant CSC201708330238 to C.Y. We thank Xueyan Jiang for assistance with data acquisition.

  • The authors declare no competing financial interests.

  • Correspondence should be addressed to David Richter at david.richter.work@gmail.com

SfN exclusive license.

Keywords

  • conceptual associations
  • expectation suppression
  • perception
  • predictive processing
