Abstract
Category learning, learning to sort a set of stimuli into categories or groups, can induce category biases in perception such that items in the same category are perceived as more similar than items from different categories. To what degree a category bias develops when learning goals emphasize individuation of each stimulus, and whether the bias emerges spontaneously during learning itself rather than in response to task demands, remains unclear. Here, we used functional MRI (fMRI) during encoding to test for category biases in neural representations of individual stimuli during learning. Human participants (males and females) encountered face-blend stimuli with unique first names and shared family names that indicated category membership. Participants were instructed to learn the full name for each face. Neural pattern classification and pattern similarity analyses were used to track category information in the brain. Results showed that stimulus category could be decoded during encoding across many frontal, parietal, and occipital regions. Furthermore, two stimuli from the same category were represented more similarly in the prefrontal cortex than two stimuli from different categories equated for physical similarity. These findings illustrate that the mere presence of a category label can spontaneously bias neural representations during encoding to emphasize category-relevant information, even in the absence of explicit categorization demands and when category-irrelevant information remains relevant for task goals.
SIGNIFICANCE STATEMENT Entities belonging to the same category are perceived as being more similar than entities belonging to different categories. Here, we show that neural representations highlighting category-relevant information form spontaneously during encoding. Notably, the presence of a category label led to neural category bias although participants focused on remembering individual stimuli and category-irrelevant stimulus features remained important for explicit task goals. These results may inform our understanding of bias in general and suggest that bias may emerge when category information is present even when one's explicit focus is on individuals.
- categorization
- category bias
- multivoxel pattern analysis
- pattern similarity analysis
- perceived similarity
Introduction
The ability to link details across our varied experiences and organize them into meaningful clusters of information is an important aspect of cognition. Concepts and categories are the basic building blocks of such organization. While we often cluster entities into categories based on their perceptual properties, the relationship between concepts and perception is mutual: our conceptual knowledge biases perception toward category-relevant features, to minimize within-category differences while accentuating between-category differences (Goldstone and Hendrickson, 2010). Category bias in perception has been described for real-world categories, such as speech categories or colors (Liberman et al., 1957; Gilbert et al., 2006), but can also be induced in the laboratory through category learning tasks (Goldstone, 1994; Beale and Keil, 1995; Kurtz, 1996; Goldstone et al., 2001).
Notably, we recently showed that the presence of a category label can induce a category bias, even when category-irrelevant stimulus features remain task-relevant (Ashby et al., 2020). Instead of a traditional category learning task, participants learned face-full name associations for face-blend stimuli. Stimuli were created by morphing together never-studied “parent” faces resulting in increased physical similarity for faces that shared a parent. Some faces with a shared parent were assigned a shared family name (belonged to the same category) while other faces with a shared parent had different family names, allowing us to dissociate the effects of physical similarity from category membership. Each stimulus was paired with a unique first name and participants learned a full name for each face. After learning, participants showed a category bias in perceptual similarity ratings: faces with the same last name were rated as more similar than faces that were physically equally similar but had different family names. The category bias in ratings predicted performance on a subsequent categorization test of never-studied face-blends. This indicates that a category label may guide organization of memory to facilitate the formation of generalizable knowledge and bias subsequent perception even when task goals emphasize memory for individual stimuli and all features are relevant to explicit task goals.
Our experiences overlap in content, allowing us to connect pieces of information experienced at different times to infer new information and generalize to novel situations. Whether related content is organized and linked in memory spontaneously during encoding (Shohamy and Wagner, 2008; Zeithamova et al., 2012) or rather in response to task demands at retrieval (Banino et al., 2016; Carpenter and Schacter, 2017, 2018) is debated (for review, see Zeithamova and Bowman, 2020). In our behavioral study (Ashby et al., 2020), we found that a generalization-predicting category bias in perception emerged immediately after learning of face-full name pairs, before an explicit generalization test. This indicated that participants formed category representations spontaneously during encoding rather than in response to generalization demands, although a contribution of a strategic decision to rate faces within the same family as more similar could not be ruled out.
In the current study, we aimed to use neural evidence to test more directly the notion that related information is organized spontaneously during encoding itself. To measure representational bias during learning, under conditions that do not create explicit generalization task demands, we used pattern-information analyses of functional MRI (fMRI) to measure neural representations of individual face stimuli during encoding of face-full name associations with overlapping family names. Using multivoxel pattern classification analysis (MVPA) and neural pattern similarity analysis, we tested for the presence of category information and category bias in neural representations in regions previously implicated in memory generalization and across the whole brain. Critically, the novel task design also allowed us to determine whether the mere presence of a category label would lead to a spontaneous formation of category representations during encoding, even when the task goals emphasized individuation of separate exemplars.
Materials and Methods
Participants
Forty-four healthy participants were recruited from the University of Oregon and surrounding community via the university SONA research system and community fliers. Participants received monetary compensation for their participation ($10/h outside the scanner and $20/h inside the scanner). All participants provided written informed consent, were right-handed, native English speakers, and were screened for neurologic conditions and medications known to affect brain function. Experimental procedures were approved by Research Compliance Services at the University of Oregon. Four participants were excluded from analyses: two for movement in excess of 1.5-mm frame-wise displacement within a run, one because of operator error resulting in poor data quality, and one for scanning interruption due to headache. The remaining 40 participants (22 female, 18 male; age 18–30 years; Mage = 21.33, SDage = 2.92) were included in all analyses.
Stimuli
Stimuli were grayscale images of blended faces that we previously developed and made publicly available (OSF repository: https://osf.io/e8htb/; see also Ashby et al., 2020). The stimulus set comprises a pool of 20 face photographs (so-called “parent” faces, never shown to the participants in our study) and all 190 pairwise computer blends of those 20 parent faces.
Training stimuli
To create the training blended faces, six parent faces were randomly chosen for each participant, three of them assigned as category-relevant and three assigned as category-irrelevant. Each of the three category-relevant parent faces was individually morphed with each of the three category-irrelevant parent faces, with equal weight given to each parent face (50/50 blend; see Fig. 1). The resultant nine face-blends were then used as stimuli in the learning task, with faces sharing a parent face being physically more similar than faces that did not share a parent face. Faces that shared a category-relevant parent also shared a family name (belonged to the same category) while faces that shared a category-irrelevant parent had different family names. Thus, using blended faces provided us with realistic-looking face stimuli while allowing us to control within-category and between-category similarity.
Structure of face-blend stimuli. Parent faces on the leftmost side are designated “category-relevant parents” as these parents determined family membership (Miller, Wilson, or Davis) during learning, recognition, and generalization. Parent faces across the top are designated “category-irrelevant parents” as these parents introduced physical similarity across families but did not determine categories. Three category-irrelevant parents were used for learning. The rightmost three category-irrelevant parents are a subset of new faces used for generalization. Parent faces were never viewed by participants, only the resulting blended faces. The face blending procedure produced pairs of faces that shared a category-relevant parent and belonged to the same family (shared parent – same family name; example indicated with dark gray box) and pairs of faces that shared a category-irrelevant parent and belonged to different families (shared parent – different family name; example indicated with medium gray box). Nonadjacent pairs did not share a parent and were not related (example indicated with light gray boxes). Figure is adapted with permission from Ashby et al. (2020). Eyes were obstructed for publication but were visible to the participants.
Because pilot data indicated that some parent faces were more distinct and thus more prominent in the resulting blend while other faces were more average and thus less prominent in the resulting blend, we took two additional steps not implemented in our prior work to better equate prelearning perceived similarity of the face-blends that shared a parent within and between categories. First, we limited the pool of possible parent faces for the creation of the training stimulus set to 10 faces (from the full set of 20) that were of intermediate distinctiveness based on an item analysis of prelearning similarity rating data that we collected through pilot testing and previously published studies (see Ashby et al., 2020; Bowman et al., 2021). Second, we implemented a yoking procedure between subjects so that two participants were assigned the same parent faces with reversed category-relevant and category-irrelevant parent designation. This ensured that if one parent face happened to have more salient features, it would be equally frequently assigned as a category-relevant parent or a category-irrelevant parent.
Test stimuli
In addition to the nine training stimuli, 42 new face-blend stimuli were created for a subsequent old/new recognition test and a surprise generalization test. To create new test stimuli, the three category-relevant parent faces were blended with 14 new parent faces (all parent faces not used for training stimuli), resulting in 14 new face-blends per category.
Experimental design
The experiment consisted of the following phases (Fig. 2): initial exposure (passive viewing), prelearning similarity ratings, observational learning of face-full name associations (scanned), postlearning similarity ratings, cued-recall of face-name associations, old/new recognition test (scanned), and category generalization (scanned). Only the fMRI data from the observational paired-associate learning phase were analyzed for the purpose of the current paper, testing for the formation of category-biased neural representations when task goals emphasize face-specific information.
Full imaging procedure. Participants passively viewed the nine training faces and rated the subjective similarity of all 36 pairwise comparisons of the training faces before entering the scanner. Face-full name learning was scanned and completed in four runs. Anatomical scans were collected during postlearning similarity ratings to minimize time spent in the scanner. Cued name recall was completed with participants communicating their answers to researchers verbally through the scanner intercom system. The recognition phase was scanned and consisted of 51 trials (9 old and 42 new faces) split into three runs. The categorization phase was also scanned, used the same faces as the recognition phase, and was likewise split into three runs. Only the fMRI data from the learning phase are considered in the current manuscript.
Passive viewing
Before entering the scanner, participants first passively viewed each of the nine training stimuli individually, once in random order without any labels and without making any responses. Face-blends were shown for 3 s with a 1-s interstimulus interval (ISI). This was done to familiarize participants with the stimuli, minimize novelty effects during the learning phase, and provide participants with an estimate of the degree of similarity between all faces before collecting the prelearning perceptual similarity ratings.
Prelearning similarity ratings
Before entering the scanner, participants rated the subjective similarity of all pairs of training faces. This allowed us to verify that participants were sensitive to the inherent similarity structure among faces introduced by the blending procedure. All possible 36 pairwise comparisons of the nine training faces were presented, and participants rated the subjective similarity of the two faces on a scale from one to six (1 = the two faces appeared very dissimilar, 6 = the two faces appeared very similar). The face pairs and the rating scale were presented simultaneously for 5 s with a 1-s ISI. For subsequent analyses, face pairs were binned into three conditions depending on whether they (1) shared a parent and a family name, (2) shared a parent but did not share a family name, or (3) did not share a parent (see example pairs in Fig. 1). Because there are nine pairs of faces that share a relevant parent, nine pairs of faces that share an irrelevant parent, and 18 pairs of faces that do not share a parent, we presented the 9 + 9 pairs of faces with shared parents twice, with counterbalanced left-right position of the two faces.
Observational learning of face-full name associations (scanned)
Participants were next placed in the MRI machine and scanned during learning of the face-full name associations across four training runs. During learning, participants studied a face-full name pair for 3 s and then made a prospective memory judgment on a scale from one to four (1 = definitely will not remember, 4 = definitely will remember) for 2 s. Prospective memory judgments were included to encourage participant engagement with the observational task and were not considered further. All trials were separated by a 3-s ISI. Each face-full name pair was studied three times per run for a total of 12 exposures across all of learning. Family names (Miller, Wilson, Davis) were shared across faces that shared a category-relevant parent face. Nine unique first names (Brad, John, Paul, Steve, Tyler, Andy, Ryan, Kyle, Eric) were randomly assigned to the faces, one per face. Participants were instructed to learn each individual's full name; the repetition of family names across faces and the presence of any category structure were not explicitly emphasized to participants. This structure and instruction directed participants to differentiate individual faces, even within the same family, while also providing an opportunity to form links between related faces in service of memory generalization.
Postlearning similarity ratings
Postlearning perceived similarity ratings were collected in the scanner while anatomic data were collected (see fMRI data acquisition below). Timing and presentation of face pairs were identical to the prelearning similarity rating procedure, in a new random order.
Cued recall of face-name associations
To assess learning success, participants completed cued-recall of the face-full name associations. During this recall phase, participants viewed each of the nine training faces individually for as much time as needed while still lying in the scanner (no MRI data were collected). Participants were instructed to vocalize the first and last name of each face and the researcher, listening through the scanner intercom system, recorded their responses. Trials were advanced by the researcher at the request of the participant. Participants were encouraged to make their best guess as to the full names of the faces even if they were not confident in their memory.
Recognition (scanned)
An old/new recognition test was also used as another learning performance metric for the individual faces. In addition to the nine training faces, participants were shown 42 never-seen faces, consisting of the 14 new blends of each of the three category-relevant parent faces. Participants were asked to indicate via button press whether the presented face was old (a face they had already studied while in the scanner) or new. No feedback was given. The 51 trials were split into three runs of 17 trials each (each run contained 14 new and 3 old faces), and each trial was presented for 4 s with an 8-s ISI. Imaging data from the recognition phase are not considered further in the current report.
Generalization (scanned)
Lastly, category knowledge was directly tested using categorization of old (training) and new face blends. New face blends were the same as those used in the recognition phase. Participants were asked to select via button press the family name for each face from the three options (Miller, Wilson, Davis) presented on the screen. No feedback was provided. The 51 trials were split into three runs of 17 trials each (14 new and 3 old faces) with 4-s trials and an 8-s ISI. Imaging data from the categorization phase are not considered further in the current report.
fMRI data acquisition
Imaging data were collected using a 3T Siemens MAGNETOM Skyra scanner at the University of Oregon Lewis Center for Neuroimaging using a 32-channel head coil. Foam padding was used around the head to minimize head motion. The scanning session started with a localizer SCOUT sequence followed by four functional runs of the learning task, and three functional runs each of the recognition and generalization tasks using a multiband gradient echo pulse sequence [TR = 2000 ms; TE = 26 ms; flip angle = 90°; matrix size = 100 × 100; 72 contiguous slices oriented 15° off the anterior commissure-posterior commissure line to reduce prefrontal signal dropout; interleaved acquisition; FOV = 200 mm; voxel size = 2.0 × 2.0 × 2.0 mm; generalized autocalibrating partially parallel acquisition (GRAPPA) factor = 2]. For each task run, 110 volumes were collected for the learning task and 104 volumes each for the recognition and categorization tasks. Only data from the learning phase are presented here. A standard high-resolution T1-weighted MPRAGE anatomic image (TR = 2500 ms; TE = 3.43 ms; TI = 1100 ms; flip angle = 7°; matrix size = 256 × 256; 176 contiguous slices; FOV = 256 mm; slice thickness = 1 mm; voxel size = 1.0 × 1.0 × 1.0 mm; GRAPPA factor = 2) and a custom anatomic T2 coronal image (TR = 13 520 ms; TE = 88 ms; flip angle = 150°; matrix size = 512 × 512; 65 contiguous slices oriented perpendicularly to the main axis of the hippocampal body; interleaved acquisition; FOV = 220 mm; voxel size = 0.4 × 0.4 × 2 mm; GRAPPA factor = 2) were collected to facilitate anatomic localization of the neural signals.
Preprocessing and single-trial modeling
Raw DICOM images were converted to NIfTI format using MRIcron's (https://people.cas.sc.edu/rorden/mricron/index.html) dcm2nii function. Functional, behavioral, and anatomic data were organized in the Brain Imaging Data Structure (BIDS) format for public dissemination on OpenNeuro (https://doi.org/10.18112/openneuro.ds003851.v1.0.0). Functional images were entered into a single-trial fMRI Expert Analysis Tool (FEAT) model from FSL version 6 (www.fmrib.ox.ac.uk/fsl). First, the functional images were skull stripped using the Brain Extraction Tool (BET) and corrected for within-run motion using MCFLIRT by realigning all volumes to the middle volume. Next, we applied high-pass temporal filtering (60 s) and minimal spatial smoothing using a 2-mm full-width at half-maximum (FWHM) Gaussian kernel. No slice timing correction was applied.
Individual trials were modeled using the general linear model (GLM), including nuisance regressors for the six standard motion parameters (rotational and translational motion). A regressor for the individual trial onset was included in each model, with events modeled with a duration of 3 s (the period during which the face-name pair was on the screen before the prospective memory judgment). This regressor was convolved with the hemodynamic response function as implemented in FSL (γ function: phase = 0 s, SD = 3 s, mean lag time = 6 s), resulting in β-weight estimates for each individual trial, for each functional run of the training task. We next concatenated the resultant β images for each trial across time, creating a single betaseries image for each of the four functional runs. Across-run realignment was then applied to the betaseries images for each run using Advanced Normalization Tools (ANTs; http://stnava.github.io/ANTs/), with the first volume of the fourth run of the training task used as the reference volume. The first volumes of all other task runs were registered to the reference volume and the resulting transformation was applied to the concatenated betaseries images. Lastly, we concatenated all the realigned betaseries images across runs for pattern analyses.
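For illustration only, a minimal sketch of this kind of single-trial ("betaseries") estimation is given below. It is not the code used for the reported analyses (which relied on FSL FEAT); the gamma-HRF parameterization (shape and scale chosen to give a 6-s mean lag and 3-s SD) and all variable names are assumptions.

```python
# Illustrative single-trial GLM sketch: one least-squares fit per trial, with that
# trial's 3-s event convolved with a gamma HRF plus six motion nuisance regressors.
import numpy as np
from scipy.stats import gamma

TR = 2.0  # repetition time in seconds

def gamma_hrf(tr=TR, duration=32.0, lag=6.0, sd=3.0):
    """Gamma HRF with mean lag 6 s and SD 3 s (shape = (lag/sd)^2, scale = sd^2/lag)."""
    t = np.arange(0, duration, tr)
    return gamma.pdf(t, a=(lag / sd) ** 2, scale=sd ** 2 / lag)

def trial_regressor(onset_s, dur_s, n_vols, tr=TR):
    """Boxcar for a single trial convolved with the HRF, sampled at the TR."""
    box = np.zeros(n_vols)
    box[int(round(onset_s / tr)):int(round((onset_s + dur_s) / tr))] = 1.0
    return np.convolve(box, gamma_hrf(tr))[:n_vols]

def betaseries(run_data, onsets, motion, dur_s=3.0):
    """run_data: (n_vols, n_voxels) time series; onsets: trial onsets in seconds;
    motion: (n_vols, 6) motion parameters. Returns one beta map per trial."""
    n_vols = run_data.shape[0]
    betas = []
    for onset in onsets:
        X = np.column_stack([trial_regressor(onset, dur_s, n_vols),
                             motion, np.ones(n_vols)])
        b = np.linalg.lstsq(X, run_data, rcond=None)[0]
        betas.append(b[0])           # beta for the single-trial regressor, all voxels
    return np.vstack(betas)          # (n_trials, n_voxels) betaseries
```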
Regions of interest (ROIs)
In addition to whole-brain searchlight analyses (described in more detail below), we conducted ROI analyses. Three ROIs were selected for their hypothesized roles in memory generalization. Ventromedial prefrontal cortex (VMPFC) was included because of its well established role in supporting memory integration and schema memory (Zeithamova et al., 2012; van Kesteren et al., 2013; Schlichting et al., 2014; Zeithamova and Bowman, 2020). Middle temporal gyrus (MTG) was included because of its role in semantic and gist memory (Mummery et al., 2000; Dennis et al., 2008; Turney and Dennis, 2017) and our recent findings of its role in category learning (Bowman and Zeithamova, 2018). Finally, the anterior portion of the hippocampus (AHIP) was included based on recent proposals that AHIP (ventral hippocampus in rodents) may be uniquely involved in forming coarser, more generalized representations (for review, see Poppenk et al., 2013) and our prior finding of generalized concept representations in AHIP (Bowman and Zeithamova, 2018; Bowman et al., 2020).
Three additional ROIs were included as control regions. Because the face-blend stimuli share physical similarity both within and across category boundaries, we chose two visual ROIs that we expected would be sensitive to the physical similarity between face-blends: lateral occipital cortex (LO) and the posterior fusiform gyrus (PFUS). Sensitivity to physical similarity in these visual regions would manifest as decoding both category-relevant and category-irrelevant information. However, even these visual regions may show category-related enhancement, such as through top-down attentional modulation from the prefrontal cortex (Folstein et al., 2013), which would manifest as increased decoding of category-relevant information. We also explored the posterior hippocampus (PHIP) to test for an anterior-posterior dissociation within the hippocampus.
ROIs were defined in each individual participant's native space using the cortical parcellation and subcortical segmentation routines from FreeSurfer version 6 (https://surfer.nmr.mgh.harvard.edu/) applied to the T1-weighted MPRAGE anatomic image. Bilateral masks for each ROI were created by collapsing together across hemispheres. The VMPFC ROI was defined as the FreeSurfer medial orbitofrontal cortex label. To obtain separate AHIP and PHIP regions, we divided the FreeSurfer hippocampal ROI at the middle slice. In the event that there was an odd number of hippocampal slices for a given participant, the middle slice was assigned to the posterior hippocampus. All ROI analyses were conducted in each participant's native space.
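A minimal sketch of the anterior/posterior hippocampal split is shown below for illustration. It is not the authors' code; the FreeSurfer aseg label codes (17/53 for left/right hippocampus), the file name, and the assumption that the second array axis runs posterior to anterior are all stated assumptions that should be checked against the actual segmentation.

```python
# Illustrative split of the FreeSurfer hippocampal segmentation into anterior and
# posterior halves at the middle slice, with an odd middle slice going to PHIP.
import numpy as np
import nibabel as nib

HIPP_LABELS = (17, 53)  # assumed FreeSurfer aseg codes for left/right hippocampus

aseg = nib.load("aparc+aseg.nii.gz")           # hypothetical file name
seg = aseg.get_fdata()
hipp = np.isin(seg, HIPP_LABELS)               # bilateral hippocampus mask

# Assume axis 1 runs posterior -> anterior (verify image orientation in practice).
y_slices = np.where(hipp.any(axis=(0, 2)))[0]  # slices the hippocampus spans
boundary = y_slices[(len(y_slices) + 1) // 2 - 1]  # last slice assigned to posterior

yy = np.arange(hipp.shape[1])[None, :, None]
post_mask = hipp & (yy <= boundary)            # posterior half (includes odd middle slice)
ant_mask = hipp & (yy > boundary)              # anterior half

nib.save(nib.Nifti1Image(ant_mask.astype(np.uint8), aseg.affine), "ahip_mask.nii.gz")
nib.save(nib.Nifti1Image(post_mask.astype(np.uint8), aseg.affine), "phip_mask.nii.gz")
```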
Behavioral statistical analyses
Memory performance for faces and names
To index participants' memory for face-name pairs, we recorded the proportion of first names and the proportion of last names correctly recalled during the cued recall test. Additionally, we used a measure of corrected hit rate (hits – false alarms) from the recognition task to determine how well participants were able to recognize the individual faces encountered during learning. Recognition performance was evaluated using a one-sample t test comparing corrected hit rate against zero.
Categorization performance
Generalization performance was measured as the accuracy (percent correct) for categorizing new face blends during the surprise categorization task. We also recorded percent correct categorization of the training faces. One-sample t tests compared categorization performance against chance performance (33.3% for three categories), separately for training faces and for new stimuli. A paired sample t test was used to compare categorization performance for the training faces against categorization performance for the new faces.
Similarity ratings
Of main interest from the similarity ratings task was the category bias in perception (similarity ratings for face pairs that shared a parent and a family name minus similarity ratings for face pairs that shared a parent but had different family names) from the postlearning similarity ratings (Ashby et al., 2020). First, we examined perceptual similarity ratings separately for the prelearning and postlearning phases. Within each phase we compared mean similarity ratings for face pairs of each type (shared parent-same family name, shared parent-different family name, not related) using repeated-measures ANOVA. To examine learning-related changes we also compared across phases using a 2 × 3 [time point (prelearning, postlearning) × pair-type (shared parent-same family name, shared parent-different family name, not related)] repeated-measures ANOVA. A Greenhouse–Geisser correction for degrees of freedom (denoted as GG) was used wherever Mauchly's test indicated there was a violation of the assumption of sphericity in the data.
Lastly, we used a Pearson correlation to determine whether the postlearning category bias in similarity ratings predicts subsequent generalization of category knowledge to new instances (see also Ashby et al., 2020). To confirm that individual differences in prelearning similarity ratings did not account for this relationship, we also used a multiple regression including both the prelearning and postlearning category biases in the model as predictors of generalization success.
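For illustration, the correlation and regression analyses described above could be sketched as follows. This is not the analysis code used here; the arrays are placeholders standing in for the per-participant behavioral measures.

```python
# Illustrative brain-behavior tests: Pearson correlation between postlearning category
# bias and generalization accuracy, then a multiple regression adding prelearning bias.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(0)
pre_bias = rng.normal(size=40)        # placeholder: prelearning category bias per participant
post_bias = rng.normal(size=40)       # placeholder: postlearning category bias per participant
generalization = rng.normal(size=40)  # placeholder: categorization accuracy for new faces

r, p = stats.pearsonr(post_bias, generalization)

X = sm.add_constant(np.column_stack([pre_bias, post_bias]))
fit = sm.OLS(generalization, X).fit()
print(r, p, fit.params, fit.pvalues)
```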
Neuroimaging ROI statistical analyses
fMRI classification of category-relevant and category-irrelevant information
To test for the presence of category information in the neural patterns representing individual face stimuli, we first used MVPA. Of main interest was measuring to what degree it is possible to decode the category-relevant parent structure (i.e., which family the stimulus belonged to) from the patterns of activation in response to each face in each ROI (and across the brain using a searchlight approach described below). We also tested to what degree it is possible to decode the category-irrelevant parent structure as a control for physical similarity-driven classification. Each face-blend stimulus contained features shared with the other face-blend stimuli with which it shared a parent, whether category-relevant or category-irrelevant. However, it belonged to the same family category only with faces with which it shared the same category-relevant parent. Thus, we reasoned that if both the category-relevant and category-irrelevant information are decodable in a given region, that would indicate that the region is sensitive to the physical similarity shared among stimuli. In contrast, if a classifier can decode the category-relevant but not the category-irrelevant information in a region, the region primarily represents category information rather than physical similarity.
We predicted reliable classification of category-relevant (but not category-irrelevant) parent structure in regions known to support memory generalization. Further, we predicted reliable classification of both category-relevant and category-irrelevant information in visual regions, as they should be sensitive to the physical similarity of the faces regardless of the learned category information. To test these predictions, we ran classifier-based analyses in PyMVPA (http://www.pymvpa.org; see also Hanke et al., 2009) using two cross-validation approaches: a traditional leave-one-run-out cross-validation and a leave-one-parent-out cross-validation. Two classification analyses were run using each cross-validation approach, one to classify the category-relevant parent faces and one to classify the category-irrelevant parent faces among the nine training faces. For the leave-one-run-out cross-validation procedure, a support vector machine (SVM) classifier was iteratively trained to classify the category-relevant or category-irrelevant parents on three out of four training runs and tested on the left-out run on each iteration. For the leave-one-parent-out cross-validation procedure, an SVM classifier was iteratively trained to classify the category-relevant or category-irrelevant parent on two out of three blends per parent and tested on the left-out blend. For example, for decoding of category-relevant information, cross-validation proceeded by leaving one category-irrelevant parent face out on each iteration, such as training to classify Miller versus Wilson versus Davis among the six training faces created by blending the category-relevant faces with the first two irrelevant parent faces, and testing on the blends of the category-relevant faces with the third irrelevant parent face. This allowed us to measure whether patterns of activity within a family allowed us to classify family membership of the left-out family member. Similarly, for decoding the category-irrelevant information, cross-validation proceeded by leaving one category-relevant parent face out and testing whether patterns of activity across faces that shared an irrelevant parent allowed us to classify the left-out individual that also shared that irrelevant parent. For the decoding analyses, classifier performance was tested against theoretical chance (33.3% for three categories or three irrelevant parent faces) using one-tailed, one-sample t tests. Of main interest was whether patterns in a given region (1) differentiated among both relevant and irrelevant parent faces, indicating sensitivity to physical similarity among faces with a shared parent; (2) differentiated among relevant but not irrelevant parent faces, indicating sensitivity to conceptual information; or (3) did not differentiate among faces that share a parent. We supplemented these analyses with one-tailed, paired t tests to determine whether classification performance was greater for category-relevant information compared with category-irrelevant information. All t tests were corrected for multiple comparisons using the Bonferroni correction. Bayes factor BF10 (in favor of the alternative hypothesis) or BF01 (in favor of the null hypothesis) is provided in addition to the standard test statistics.
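To make the two cross-validation schemes concrete, a minimal sketch is given below using scikit-learn rather than the PyMVPA pipeline actually used; `betas` (trial-by-voxel betaseries for one ROI) and the per-trial label arrays `family`, `irrelevant_parent`, and `run` are assumed names.

```python
# Illustrative SVM decoding with leave-one-run-out or leave-one-parent-out folds.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def cross_validated_accuracy(betas, target, grouping):
    """Hold out one level of `grouping` (a run, or a parent face) per fold,
    train a linear SVM on the rest, and return mean accuracy across folds."""
    accuracies = []
    for held_out in np.unique(grouping):
        train, test = grouping != held_out, grouping == held_out
        clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
        clf.fit(betas[train], target[train])
        accuracies.append(clf.score(betas[test], target[test]))
    return np.mean(accuracies)

# Leave-one-run-out decoding of family (category-relevant parent):
#   acc_relevant = cross_validated_accuracy(betas, family, run)
# Leave-one-parent-out decoding of family: hold out all blends of one
# category-irrelevant parent and test family membership of the left-out blends:
#   acc_relevant_xparent = cross_validated_accuracy(betas, family, irrelevant_parent)
```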
Neural pattern similarity representations of category information
Our second approach was to use neural pattern similarity analysis, also known as representational similarity analysis, to directly test for the existence of a category bias in neural representations (Kriegeskorte et al., 2008). Since pairs of faces that share a category-relevant parent and pairs that share a category-irrelevant parent are equated for physical similarity, greater neural pattern similarity for faces that share a category-relevant parent would demonstrate that learning altered neural representations to reflect a category bias. As with the MVPA approach, we predicted generalization regions, but not necessarily the visual regions, would demonstrate this neural category bias. To test for the category-biased representations, we first measured the degree of neural pattern similarity using a Pearson correlation within each ROI for all pairs of trials that (1) shared a parent and also shared a family name and (2) shared a parent and had different family names. The resulting r values were Fisher z-transformed to permit statistical analyses. For each participant and ROI, we then calculated the category bias in neural pattern similarity by subtracting the mean pattern similarity for the two types of pairs (shared parent-same family name, shared parent-different family name) and dividing the difference by their variability to quantify the category bias in terms of a normalized distance, Cohen's d (Zeithamova et al., 2017). The pattern of results remains the same when raw (not normalized) similarity differences are used. Category biases in neural representations for each hypothesized generalization ROI were then tested against zero using one-tailed, one-sample t tests, corrected for multiple comparisons.
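For illustration, a minimal sketch of this category-bias measure is shown below (not the authors' code). The pooled-SD form of the normalization is one plausible reading of "dividing the difference by their variability"; `betas` and the pair-index lists are assumed names.

```python
# Illustrative ROI category-bias measure: Fisher z-transformed pattern correlations
# for shared-parent pairs, then a normalized (Cohen's d-like) same-minus-different
# family difference.
import numpy as np

def pair_similarities(betas, pair_indices):
    """Fisher z-transformed Pearson correlations for the given (i, j) trial pairs."""
    z = []
    for i, j in pair_indices:
        r = np.corrcoef(betas[i], betas[j])[0, 1]
        z.append(np.arctanh(r))                  # Fisher z transform
    return np.array(z)

def category_bias(betas, same_family_pairs, diff_family_pairs):
    """Normalized difference in neural pattern similarity (same minus different family)."""
    same = pair_similarities(betas, same_family_pairs)
    diff = pair_similarities(betas, diff_family_pairs)
    pooled_sd = np.sqrt((same.var(ddof=1) + diff.var(ddof=1)) / 2)
    return (same.mean() - diff.mean()) / pooled_sd
```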
Neuroimaging whole-brain statistical analyses
Because the anatomic ROI approach may be insufficient by either including uninformative voxels or excluding informative voxels, and because we were interested in how any potential category representations may be distributed across the brain, we also conducted multivoxel pattern-information analyses (classification and pattern similarity) using a searchlight approach (Kriegeskorte et al., 2006; Kriegeskorte and Bandettini, 2007), with a sphere (3-mm radius) iteratively swept across the entire brain using PyMVPA. Because of pitfalls associated with interpreting localization of results in searchlight approaches, and in line with recommendations from Etzel et al. (2013), we primarily sought to examine the presence of category representations across the brain and exercised caution when interpreting localization of category information to specific regions.
Searchlight classification of category-relevant and category-irrelevant information
For the classification analysis, the searchlight produced one searchlight decoding accuracy map for category-relevant and one for category-irrelevant parent information for each subject. Individual subject searchlight maps were then normalized to the standard MNI template space using ANTs. Transformations to standard space were calculated between each subject's reference volume (run four of training) and the standard template and then applied to the searchlight maps for category-relevant and category-irrelevant classification.
Next, individual subject maps in standard space were merged into two 4D maps (one for category-relevant and one for category-irrelevant) and smoothed (Gaussian kernel: 2 mm) in preparation for group-level statistics. To compute one-sample t tests on the merged images to statistically test which regions in the brain represented category-relevant and category-irrelevant information, we first subtracted theoretical chance performance (1/3) from each merged image and masked the images with the standard MNI template whole-brain mask. Statistics were computed using FSL randomise and thresholded using cluster-extent thresholding (voxel t > 2.7, cluster p < 0.05) within the standard MNI gray-matter mask. We report the statistically significant clusters as well as the local maxima extracted from each cluster for tabulation of results.
Searchlight neural pattern similarity representations of category information
We tested for category-biased neural representations across the entire brain by running a neural pattern similarity searchlight analysis (i.e., representational similarity analysis), computing pairwise similarity patterns for all trials within each sphere and producing a searchlight map of category bias in face representation for each subject, using the same definition of category bias as described in the ROI analysis (normalized difference of neural similarity of faces that share a parent and last name and neural similarity of faces that share a parent but not last name). Searchlight maps were next normalized, merged, smoothed, and masked with the standard MNI whole-brain template as described above for the classification searchlight analysis. As with the MVPA classification analysis, statistical inference was performed using one-sample t tests against the null (pattern similarity for relevant minus pattern similarity for irrelevant equals zero) using FSL randomise and cluster-extent thresholding (voxel t > 2.7, cluster p < 0.05) within a standard MNI gray-matter mask.
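A minimal sketch of the searchlight sweep is shown below for illustration. It is not the PyMVPA implementation used for the reported analyses; it reuses the `category_bias()` function from the ROI sketch above, and `beta_img` (a 4D trial-by-voxel image), `brain_mask`, and the pair-index lists are assumed names. The 1.5-voxel radius corresponds to a 3-mm sphere with 2-mm voxels.

```python
# Illustrative searchlight: sweep a small sphere across in-mask voxels and compute
# the category-bias statistic from the local multivoxel pattern at each center.
import numpy as np

def sphere_offsets(radius_vox):
    r = int(np.ceil(radius_vox))
    grid = np.mgrid[-r:r + 1, -r:r + 1, -r:r + 1].reshape(3, -1).T
    return grid[np.linalg.norm(grid, axis=1) <= radius_vox]

def searchlight_bias(beta_img, brain_mask, same_pairs, diff_pairs, radius_vox=1.5):
    """beta_img: (X, Y, Z, n_trials) betaseries; brain_mask: (X, Y, Z) boolean."""
    offsets = sphere_offsets(radius_vox)
    out = np.zeros(brain_mask.shape)
    for center in np.argwhere(brain_mask):
        coords = center + offsets
        inside = np.all((coords >= 0) & (coords < brain_mask.shape), axis=1)
        coords = coords[inside]
        coords = coords[brain_mask[coords[:, 0], coords[:, 1], coords[:, 2]]]
        # trials x voxels pattern for this sphere
        betas = beta_img[coords[:, 0], coords[:, 1], coords[:, 2], :].T
        out[tuple(center)] = category_bias(betas, same_pairs, diff_pairs)
    return out
```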
Results
Behavioral
Memory for faces and names
We first examined recall accuracy from the cued-recall task to assess how well participants stayed on task and learned the first and family names during the observational paired-associates learning. On average, participants were able to recall 58% of first names and 65% of family names, similar to our prior behavioral study (52% of first names, 65% of family names, see Ashby et al., 2020). As another measure of learning of individual faces, we used a corrected hit rate (hits – false alarms) from the recognition phase to account for unequal exposure to old (n = 9) and new training faces (n = 42). We found evidence for good recognition as the average corrected hit rate for participants was 79.5% (SD = 17%) which was well above zero (t(39) = 29.19, p < 0.001, d = 4.62). The hit rate was 89.1% (SD = 11.1%) and the false alarm rate was 9.6% (SD = 11.2%).
Categorization performance
Next, we examined performance for learning the category-relevant information by assessing categorization accuracy during the surprise categorization task. We examined accuracy separately for the training faces and the generalization faces. Participants correctly categorized 69% (SD = 21%) of the old (training) faces and 62% (SD = 18%) of the new faces. While categorization accuracy for the new faces was lower than for the training faces (t(39) = 3.19, p = 0.003, d = 0.505), it was still well above chance (both training and new faces, t(39) > 10.00, p < 0.001, d > 1.58). The successful categorization of the new faces into the appropriate family categories indicates that category information extracted during learning was successfully generalized.
Similarity ratings
Prelearning similarity ratings confirmed that participants were sensitive to the similarity structure among stimuli, introduced by the blending procedure (Fig. 3A). We found a significant main effect of pair type (F(2,78) = 96.18, p < 0.001,
Behavioral category bias. A, Average similarity ratings for faces that share a parent and family name, faces that only share a parent, and faces that do not share any parents before learning. B, Average similarity ratings for the same pairwise comparisons after learning. No significant category bias in perception was found when averaged across subjects. C, Changes in similarity ratings from prelearning to postlearning. An overall significant decrease in perceived similarity was observed. D, Positive relationship between indirect (category bias in perception) and direct (categorization accuracy for new faces) measures of memory generalization. Error bars denote standard error of the mean.
Postlearning similarity ratings (Fig. 3B) also differed by pair type (F(1.67,65.13) = 91.93, p < 0.001,
A 2 × 3 [time point (prelearning, postlearning) × pair-type (shared parent-same family name, shared parent-different family name, not related)] repeated-measures ANOVA showed a significant main effect of time point (F(1,39) = 5.89, p = 0.020,
Although the overall effect of the category bias in postlearning similarity ratings was not significant, our prior work showed a postlearning category bias in perception that predicted generalization performance (Ashby et al., 2020; Bowman et al., 2021). Thus, we wanted to examine whether individual differences in the category bias were still related to performance on the generalization task. As predicted, a Pearson correlation showed a significant relationship such that larger postlearning category biases in perception were associated with better generalization performance during the categorization task (r(39) = 0.57, p < 0.001; Fig. 3D). The category bias in perceived similarity postlearning remained a significant predictor of subsequent generalization performance even when prelearning similarity ratings were considered (multiple regression: prelearning category bias b = 0.14, t(39) = 0.69, p = 0.49; postlearning category bias b = 0.47, t(39) = 2.36, p = 0.024). These results successfully replicate our previous finding that a category bias in perceived similarity ratings may be a useful indirect measure of category learning when explicit generalization demands are minimized.
MVPA pattern classification of category-relevant and category-irrelevant visual information
ROI analyses
Each training face-blend contained features that were both category-relevant and category-irrelevant. We predicted that putative generalization regions (VMPFC, MTG, AHIP) would classify category-relevant information but not category-irrelevant information, showing sensitivity to conceptual rather than perceptual information. We predicted that the visual control regions (PFUS, LO) would classify both category-relevant and category-irrelevant information, reflecting sensitivity to physical similarity, although category-relevant information may be further enhanced (Folstein et al., 2013, 2015).
To test these hypotheses, we used two different approaches for training and testing the SVM classifier. Our first approach used a standard, leave-one-run-out cross validation procedure. Our second approach used a leave-one-parent-out cross-validation procedure, which allowed us to test whether parent-related activation patterns generalize across their blends, such as category-related activation patterns generalizing across category members.
Leave-one-run-out cross-validation
MVPA classifier performance for decoding category-relevant and category-irrelevant information during learning in each of the a priori ROIs using a leave-one-run-out cross-validation procedure is presented in Figure 4A. First, we examined classifier performance within putative generalization regions (Fig. 4A, left side). Significance for all t tests was determined by a Bonferroni-adjusted α level of p = 0.0167 (α = 0.05 divided by three regions). One-sample t tests compared classifier accuracy for generalization regions against chance performance (33.3% for three categories), revealing significant decoding of category-relevant information in MTG (t(39) = 3.95, p < 0.001, d = 0.57, BF10 = 35), which remained significant after correcting for multiple comparisons. None of the putative generalization regions decoded category-irrelevant information (all t < 0.2, p > 0.45, d < 0.02, BF01 > 3.8 in favor of the null hypothesis). Category-relevant decoding in MTG was reliably greater than irrelevant-parent decoding (t(39) = 2.31, p = 0.013, d = 0.37, BF10 = 3.56) and survived correction for multiple comparisons.
Pattern classification analyses within putative generalization and control ROIs. Mean cross-validated classifier accuracies for category-relevant (blue) and category-irrelevant (red) parent face decoding. Dashed line denotes chance (33.3% for three categories). A, Classifier was trained iteratively using a leave-one-run-out cross-validation procedure on three runs of data and tested on the left-out run of data. B, Classifier was trained iteratively on two out of three blends per parent using a leave-one-parent-out cross-validation procedure and tested on the left-out blend (either a family member for category-relevant decoding or a blend from a different family for category-irrelevant decoding). Error bars denote standard error of the mean. Star (*) denotes p < 0.05 (corrected).
Next, we examined classifier performance within the control regions (Fig. 4A, right side). We predicted that visual regions (LO, PFUS) would be sensitive to the physical similarity of the faces rather than the learned category information and thus would classify both category-relevant and category-irrelevant information. As PHIP has been shown to be involved in episodic memory, we did not have any specific predictions for the patterns of activity we might see in this region during a category-learning task with specificity goals. One-sample t tests compared classifier accuracy for the control regions against chance performance, revealing significant decoding of category-relevant information in LO (t(39) = 5.31, p < 0.001, d = 0.79, BF10 = 3496) that survived correction for multiple comparisons. As predicted, we also found significant decoding of category-irrelevant information in LO (t(39) = 3.64, p < 0.001, d = 0.52, BF10 = 33), indicating that visual cortex was sensitive to the physical similarity of the faces. Although category-relevant decoding in LO was numerically greater than category-irrelevant decoding, that difference did not reach statistical significance after correction for multiple comparisons (t(39) = 1.73, p = 0.046, d = 0.27, BF10 = 1.3).
Leave-one-parent-out cross-validation
MVPA classifier performance using leave-one-parent-out cross validation is presented in Figure 4B. Within the putative generalization regions (Fig. 4B, left side), we found significant classification of category-relevant information in MTG (t(39) = 3.56, p < 0.001, d = 0.56, BF10 = 61), and no significant classification of category-irrelevant information (all t < 1.4, p > 0.08, d < 0.22, BF01 > 1.3 in favor of the null hypothesis). Paired t tests comparing classifier performance for category-relevant and irrelevant information showed greater decoding accuracy for category-relevant information in MTG (t(39) = 2.52, p = 0.008, d = 0.40, BF10 = 5.5) and VMPFC (t(39) = 2.29, p = 0.014, d = 0.36, BF10 = 3.4) both of which remained significant after correction for multiple comparisons.
Within control regions (Fig. 4B, right side), we found significant decoding of category-relevant information in LO (t(39) = 3.97, p < 0.001, d = 0.63, BF10 = 179) and PFUS (t(39) = 3.27, p = 0.001, d = 0.52, BF10 = 29). We also found significant decoding of category-irrelevant information in LO (t(39) = 3.19, p = 0.002, d = 0.50, BF10 = 24) and PFUS (t(39) = 2.91, p = 0.003, d = 0.46, BF10 = 13), with no significant difference in decoding accuracy for category-relevant versus irrelevant information (all t < 0.94, p > 0.18, d < 0.15, BF01 > 2.4 in favor of the null hypothesis).
Whole-brain MVPA classification searchlight analyses
Leave-one-run-out cross-validation
Whole-brain searchlight maps for decoding of category-relevant and category-irrelevant information across the learning phase are presented in Figure 5A. For category-relevant classification, we found four significant clusters, three of which were large clusters spanning many brain regions (Table 1). Notably, regions that classified category-relevant information were widespread and distributed across large portions of the frontal lobes, parietal lobes, occipital lobes, and the midline. In contrast, MVPA searchlight for category-irrelevant classification yielded only a single statistically significant cluster fully confined to the occipital pole (see Table 1; Fig. 5A). Most voxels within this cluster were also part of the category-relevant classification cluster (see magenta in occipital cortex in Fig. 5A), suggesting that this region was sensitive to physical similarity regardless of category membership. Direct contrast of the category-relevant versus category-irrelevant decoding did not yield any significant clusters that survived correction.
Learning phase searchlight MVPA results for leave-one-run-out cross-validation
Whole-brain searchlight MVPA results for cross-validation via run and parent. A, MVPA searchlight maps for category-relevant (blue) and category-irrelevant (red) decoding using a leave-one-run-out cross-validation procedure. Category-irrelevant decoding in occipital cortex largely overlapped with decoding for category-relevant information (magenta). B, Using the leave-one-parent-out procedure, category-irrelevant decoding in occipital and parietal lobes largely overlapped with decoding for category-relevant information (magenta). Animations fully displaying the pattern of results across the entire brain are available on OSF for both neural pattern classification analyses.
Leave-one-parent-out cross-validation
The whole-brain searchlight maps using the leave-one-parent-out cross-validation approach are displayed in Figure 5B. For category-relevant classification, we found five significant clusters, including two large clusters spanning many brain regions (Table 2). As with the leave-one-run-out approach, regions decoding relevant information were widespread and included parietal, occipital, and medial prefrontal regions, but were perhaps less robust in the lateral prefrontal cortex than with the leave-one-run-out approach. In contrast to the leave-one-run-out approach described above, we also found five significant clusters that classified category-irrelevant information, including occipital, posterior parietal, and temporal regions. Again, direct contrast of the category-relevant versus category-irrelevant decoding did not reach the corrected threshold.
Learning phase searchlight MVPA results for leave-one-parent-out cross-validation
Neural pattern similarity representations of category bias
Pattern similarity ROI analysis
To directly test for a category bias in neural representations, we compared neural pattern similarity for pairs of faces that shared a category-relevant parent face to pairs of faces that shared a category-irrelevant parent face. In the hypothesized generalization regions (Fig. 6A, left side), one-sample t tests compared differences in pattern similarity against the null (no difference), revealing a category bias in VMPFC (t(39) = 2.21, p = 0.0165, d = 0.35, BF10 = 2.9) and MTG (t(39) = 2.06, p = 0.023, d = 0.33, BF10 = 2.2), although the category bias in MTG did not survive correction for multiple comparisons. In the control regions (Fig. 6A, right side), we found evidence for a category bias in LO (t(39) = 2.18, p = 0.0175, d = 0.34, BF10 = 2.8) that did not survive correction for multiple comparisons. Overall, two of the hypothesized generalization regions, as well as LO, showed some evidence of category bias in neural representations of individual faces.
Category-biased neural representations within a priori ROIs and across the whole brain. A, Category-biased face representations measured as normalized neural pattern similarity differences between pairs of faces that share a parent and belong to the same category and pairs of faces that share a parent but belong to different categories. Error bars denote standard error of the mean. B, Neural pattern similarity searchlight map for category representations (shared parent same family name – shared parent different family name).
Whole-brain pattern similarity searchlight analysis
Next, we used the whole-brain searchlight approach to perform the neural pattern similarity analysis and look for a neural category bias (shared parent-same category > shared parent-different category) across the entire brain. The analysis revealed a single cluster in bilateral frontal pole/medial prefrontal cortex that survived thresholding (599 voxels, peak: MNI −6, 68, 4; t = 4.10; Fig. 6B).
Together, findings from the ROI and searchlight analyses indicate that the presence of a shared category label biased neural representations to highlight category-relevant information, although task instructions emphasized memory for individual instances and category-irrelevant information was relevant for explicit task goals. Contrary to our prediction, the bias did not appear unique to regions previously implicated in memory generalization.
Discussion
Prior work has indicated that category learning induces a perceptual category bias whereby items within categories are perceived as more similar to one another than items across category boundaries. We recently extended those findings beyond traditional category learning to a task where category-irrelevant features remain important for explicit task goals that require individuating stimuli within each category (Ashby et al., 2020). Here, we used the same paradigm to examine neural evidence for the spontaneous formation of category-biased representations during encoding itself. Participants learned face-full name associations using blended face stimuli to equate physical similarity within and across category boundaries. Perceived similarity ratings were collected before and after learning, and category generalization to never-studied faces was measured in a subsequent categorization task. Individual differences in the postlearning category bias in similarity ratings predicted performance on the subsequent generalization task. MVPA of encoding fMRI data revealed better decoding of category-relevant than category-irrelevant information in two a priori generalization regions (VMPFC, MTG), and a whole-brain searchlight showed that category-relevant information was widespread throughout the cortex, including frontal regions, while category-irrelevant information was more limited. Pattern similarity analysis provided additional evidence that decoding of categories was not solely driven by physical similarity among category members, revealing a category bias in neural representations (greater similarity for faces from the same category than for physically equally similar faces that belonged to different categories) in MTG, VMPFC, frontal pole/medial prefrontal cortex, and possibly LO. Together, our results indicate that category-biased neural representations form spontaneously during encoding and are not merely the product of generalization demands at retrieval. Notably, neural representations were biased toward category-relevant information although participants' goal was to remember the full name for each individual face and the category-irrelevant information was important for the explicit task goals.
Whether related events are linked on-the-fly at retrieval in response to generalization demands (Banino et al., 2016; Carpenter and Schacter, 2017, 2018) or whether they are spontaneously linked during encoding (Shohamy and Wagner, 2008; Zeithamova et al., 2012) remains debated in the literature (for review, see Zeithamova and Bowman, 2020). Our prior work provided preliminary behavioral evidence that a category bias in perceptual similarity ratings may be a good indicator of the formation of generalizable category knowledge before explicit generalization task demands (Ashby et al., 2020). While measuring the perceptual category bias after learning greatly minimized generalization task demands, it was not possible to rule out that probing similarity judgments may itself have induced a strategic decision to rate faces with the same last name as more similar to one another. The current study allowed us to directly observe neural evidence that representations of conceptually related faces are already biased to emphasize category-relevant information during learning, although both category-relevant and category-irrelevant information is important for explicit task goals. These results indicate that links between related events do not have to result only from task demands at retrieval. Instead, related information may be organized into meaningful clusters spontaneously during encoding to support the formation of generalizable knowledge.
We predicted that representations biased toward category-relevant information would form in regions implicated in concept and schema representations and memory generalization, such as VMPFC (Kumaran et al., 2009; Bowman and Zeithamova, 2018) and MTG (Webb et al., 2016; Turney and Dennis, 2017). Our prior work identified abstract category representations predominantly localized to the VMPFC, MTG, and hippocampus in a task where all stimulus features were category relevant (Zeithamova et al., 2008; Bowman and Zeithamova, 2018; Bowman et al., 2020). In the current study, the task included learning which features were relevant and irrelevant for determining category membership. MTG and VMPFC showed evidence for forming category-biased representations, but representations highlighting category-relevant information did not appear unique to these regions. Although the results from the rest of the brain were more ambiguous, several aspects of our data suggest that category-biased representations may be relatively widespread. For example, the ROI pattern similarity analysis and the cross-run decoding analysis suggested that LO representations, while sensitive to physical similarity among stimuli sharing either a relevant or an irrelevant parent, may also be biased toward representing category-relevant features. This would align with other findings of category effects in visual regions, such as through top-down modulation from prefrontal cortex (Folstein et al., 2013, 2015). Using functionally defined visual regions that are sensitive to faces (e.g., fusiform and/or occipital face areas) in future studies may provide more definitive evidence with respect to category-biased representations in visual cortex.
Furthermore, classification of category-relevant information during learning was relatively widespread across the brain, from early visual regions to high-level cognitive regions in parietal and prefrontal cortices. The success in classifying category-relevant information was not readily attributable to physical similarity alone, as classification of category-irrelevant information that equally affected physical similarity was more confined and did not extend to the prefrontal cortex. The differences between relevant and irrelevant decoding were especially notable when using cross-run cross-validation, probably reflecting sensitivity to representational shifts across learning, although a direct relevant versus irrelevant contrast yielded no clusters that survived correction. Dorsomedial prefrontal cortex, which was not one of our a priori ROIs, also demonstrated category-biased representations in the pattern similarity searchlight, although the regional specificity of the searchlight analyses should be interpreted cautiously (Etzel et al., 2013). Thus, it is likely that many regions beyond the putative memory integration regions may form representations highlighting category-relevant stimulus features.
One reason why the presence of a category label may lead to category-biased representations is an attention bias toward relevant features. A study by Mack et al. (2013) used neural pattern similarity analysis and detected representations of category exemplars distributed across prefrontal, parietal, and occipital visual cortices, but only when category learning-related attention weights to individual stimulus features were considered. Neural representations tracking physical similarity among stimuli (without accounting for increased attention to stimulus features relevant for classification) appeared confined to the visual cortex. Likewise, we found category-relevant information to be decodable across large portions of the brain, including prefrontal and parietal regions theorized to bias attention toward task-relevant information (Gabrieli et al., 1998; Poldrack et al., 1999; Wagner et al., 2001; Seger and Cincotta, 2005; Deng et al., 2008). We speculate that the neural category bias may reflect an attentional shift to category-relevant features, one that engages large portions of the brain rather than being specific to regions implicated in generalization and memory integration.
The current study builds on a strong line of cognitive research on concept learning and categorization. Many concept learning theories assume that a key part of learning is an allocation of attentional resources away from category-irrelevant information to category-relevant features (Nosofsky, 1986). This results in a stretching and shrinking of perceptual space along the relevant and irrelevant stimulus dimensions to reduce perceived differences among items within a category and increase discriminability of items belonging to different categories (Goldstone, 1994; Beale and Keil, 1995; Kurtz, 1996; Livingston et al., 1998; Goldstone et al., 2001; Gureckis and Goldstone, 2008; Folstein et al., 2013; Soto, 2019). Notably, we extend prior work on category bias to a task where the category-irrelevant information cannot merely be ignored (as is possible in a traditional category learning task) and instead is important for explicit task goals. Interestingly, we found neural evidence of category bias although the average behavioral category bias in similarity ratings, observed in this paradigm previously (Ashby et al., 2020; Bowman et al., 2021), was not significant in the current sample. This suggests that the neural measures may be earlier or more sensitive indicators of bias than explicit behavior, as observed for neural markers in other domains (Gabrieli et al., 2015; Farah, 2017).
In conclusion, categorical knowledge alters how we perceive the world. The current findings extend prior behavioral work on category bias in perception to demonstrate category bias in the neural representations of individual stimuli. Notably, category bias can form spontaneously during learning, in the absence of explicit categorization demands, and even when task goals emphasize differentiation of individual stimuli. These results inform our understanding of bias in other domains, such as gender or race biases, and suggest that bias may emerge when category information is present even when one's explicit focus is on individuals.
Footnotes
This work was supported by the National Institute of Neurological Disorders and Stroke Grant R01-NS-112366 (to D.Z.).
The authors declare no competing financial interests.
- Correspondence should be addressed to Dagmar Zeithamova at dasa@uoregon.edu