Abstract
Effective generalization in a multiple-category situation involves both assessing potential membership in individual categories and resolving conflict between categories while implementing a decision bound. We separated generalization from decision bound implementation using an information integration task in which category exemplars varied over two incommensurable feature dimensions. Human subjects first learned to categorize stimuli within limited training regions, and then, during fMRI scanning, they also categorized transfer stimuli from new regions of perceptual space. Transfer stimuli differed both in distance from the training region prototype and distance from the decision bound, allowing us to independently assess neural systems sensitive to each. Across all stimulus regions, categorization was associated with activity in the extrastriate visual cortex, basal ganglia, and the bilateral intraparietal sulcus. Categorizing stimuli near the decision bound was associated with recruitment of the frontoinsular cortex and medial frontal cortex, regions often associated with conflict and which commonly coactivate within the salience network. Generalization was measured in terms of greater distance from the decision bound and greater distance from the category prototype (average training region stimulus). Distance from the decision bound was associated with activity in the superior parietal lobe, lingual gyri, and anterior hippocampus, whereas distance from the prototype was associated with left intraparietal sulcus activity. The results are interpreted as supporting the existence of different uncertainty resolution mechanisms for uncertainty about category membership (representational uncertainty) and uncertainty about decision bound (decisional uncertainty).
Introduction
Generalizing learned categorical knowledge and applying it to new stimuli is crucial for adapting to novel situations. Category structure is graded: different exemplars have different degrees of membership. According to prototype learning theories (Minda and Smith, 2001; Love et al., 2004), novel stimuli are categorized as members or nonmembers of a single category (an A/not-A task) based on similarity to the average stimulus (Davis et al., 2014). The degree of membership for stimuli is often operationalized as distance in perceptual space from the prototype (Smith and Minda, 2002). Previous neuroimaging studies investigating generalization to novel regions of perceptual space using A/not-A tasks report greater activity for more distant stimuli in visual regions and across a frontal–parietal–striatal network (Reber et al., 1998; Daniel et al., 2011). In a single-unit recording study in macaques, Antzoulatos and Miller (2011, 2014) found that striatal neurons were active when categorizing previously trained stimuli, whereas prefrontal cells were recruited for generalizing categorical knowledge to novel exemplars. Learning to categorize stimuli into two categories (A/B tasks) is more complex: it requires consideration of both potential membership in each category and the decision bound between the categories. A/B category learning-related changes have been reported in a similar frontoparietal–striatal network (Shohamy et al., 2008; Seger et al., 2010).
We examined generalization to novel regions of perceptual space in an A/B task and dissociated neural systems underlying generalization from those underlying conflict resolution during decision bound implementation. We used an information integration paradigm in which stimuli are formed by varying two features, bar width and orientation (Fig. 1), resulting in two categories separated by a diagonal decision bound. This task reliably recruits striatal and cortical regions generally associated with category learning (Seger and Cincotta, 2002; Cincotta and Seger, 2007; Waldschmidt and Ashby, 2011). We pretrained subjects on stimuli from a training region (Fig. 1, red region), and then, during scanning, they categorized novel stimuli from the training region and three transfer regions: (1) parallel to the decision bound (“Flanking”; pink region); (2) farther from the decision bound (“Far”; black region); and (3) between the training regions along the decision bound (“Boundary”; blue region). The Flanking and Far stimuli had similar distances from training region stimuli. This approach allowed us to perform categorical analyses comparing activity between conditions and also to perform parametric analyses implementing two independent measures of distance in perceptual space: (1) distance from the decision bound; and (2) distance from the appropriate category prototype.
Generalization was assessed in the parametric analyses examining greater distance from the prototype or decision bound and categorical analyses comparing Flanking and Far stimuli with Training stimuli. We predicted that generalization would recruit lateral frontoparietal regions, particularly the lateral prefrontal cortex (Antzoulatos and Miller, 2011), and regions along the intraparietal sulcus (IPS; Freedman and Assad, 2006; Braunlich et al., 2015), as well as regions of the basal ganglia, in particular regions of the posterior caudate associated with successful category learning (Seger, 2008).
Decision bound implementation was assessed in the parametric analysis in terms of closeness to the bound and in the categorical analyses comparing the Flanking and Training regions with the Far region. In this task, decision bound implementation requires resolution of conflicting potential category memberships; stimuli near the decision bound could potentially be members of either category. We hypothesized that this resolution would require executive functions important in resolving response conflict and would recruit the frontoinsular and medial frontal cortical regions known as the salience network (Seeley et al., 2007; Menon and Uddin, 2010; Ham et al., 2013). The salience network is active during conflict processing across a wide range of psychological domains (Fan et al., 2014; Jung et al., 2014; Silvetti et al., 2014). This hypothesis is supported by previous studies in which subjects were instructed to classify stimuli using a specified decision bound; in those studies, stimuli near the bound were consistently associated with activity in the medial prefrontal and frontoinsular cortices (Grinband et al., 2006; White et al., 2012).
Materials and Methods
Subjects.
Nineteen subjects (seven males, 12 females) were recruited from the undergraduate student population at South China Normal University. All met criteria for MR scanning and were paid for their participation. A total of three were excluded after preprocessing for excessive motion during the scan (>2 mm shift or 2° of rotation), resulting in a total of 16 included in analyses.
Materials.
Stimuli were circular sine wave gratings, sometimes termed Gabor patches, that varied only in spatial orientation and frequency (Fig. 1). To generate these stimuli, we first generated points within an “unrotated” arbitrary 0:700 space. We then rotated these points by 45° such that the optimal rule required the integration of both stimulus dimensions (Fig. 1) and then transformed these arbitrary dimensions into frequency and orientation parameters used to generate the gratings. Before rotation, Training cluster stimuli were drawn from two bivariate uniform distributions with 75 stimuli within each cluster (both clusters: mean Y = 350, SD Y = 115, SD X = 15.5; Category A: mean X = 260; Category B: mean X = 440). Stimuli in the Boundary clusters were drawn from two bivariate normal distributions with 38 stimuli within each cluster (both clusters: mean Y = 350, SD Y = 80, SD X = 13; Category A: mean X = 320; Category B: mean X = 320). Stimuli in the Far clusters were drawn from two bivariate normal distributions with 75 stimuli within each cluster (both clusters: mean Y = 350, SD Y = 53, SD X = 15.5; Category A: mean X = 100; Category B: mean X = 600). Stimuli in the Flank clusters were drawn from four bivariate normal distributions with 25 stimuli within each cluster [the mean of each cluster was 400 units from the center of the Training cluster on the Y dimension (350 ± 400), each cluster SD X = 15.5, SD Y = 15.5; Category A: mean X = 260; Category B: mean X = 440]. After rotation, we transformed this arbitrary x, y space into orientation and frequency space according to the following linear equations: orientation (in degrees from horizontal counterclockwise) = x/7.7778; frequency (in cycles per visual degree) = (y × 0.001) + 0.25.
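The generation steps described above (sample in the unrotated space, rotate by 45°, then map to grating parameters) can be sketched as follows. This is a minimal illustration, not the code used in the study: the pivot of the rotation (the center of the 0:700 space), its counterclockwise direction, the independence of the X and Y dimensions, and all function names are assumptions. Only the cluster parameters and the two linear equations come from the text.

```python
import math
import random

CENTER = 350.0  # assumed rotation pivot: middle of the arbitrary 0:700 space


def sample_normal_cluster(mean_x, mean_y, sd_x, sd_y, n, rng):
    """Draw n points from a bivariate normal with independent X and Y (assumed)."""
    return [(rng.gauss(mean_x, sd_x), rng.gauss(mean_y, sd_y)) for _ in range(n)]


def rotate_45(x, y, cx=CENTER, cy=CENTER):
    """Rotate (x, y) by 45 degrees counterclockwise about (cx, cy)."""
    theta = math.radians(45.0)
    dx, dy = x - cx, y - cy
    xr = cx + dx * math.cos(theta) - dy * math.sin(theta)
    yr = cy + dx * math.sin(theta) + dy * math.cos(theta)
    return xr, yr


def to_grating_params(xr, yr):
    """Apply the linear equations from the text to a rotated point."""
    orientation_deg = xr / 7.7778        # degrees counterclockwise from horizontal
    frequency_cpd = yr * 0.001 + 0.25    # cycles per visual degree
    return orientation_deg, frequency_cpd


# Far cluster for Category A, using the parameters stated in the text.
rng = random.Random(0)
far_a = sample_normal_cluster(100.0, 350.0, 15.5, 53.0, 75, rng)
far_a_gratings = [to_grating_params(*rotate_45(x, y)) for x, y in far_a]
```

Because the rotation mixes the two arbitrary dimensions, a point's category can no longer be read off orientation or frequency alone, which is what makes the task an information integration task.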
The particular values for the distributions underlying the stimuli were selected after extensive pilot testing. The goal was to select training distributions that would require subjects to integrate information across both dimensions (i.e., one in which use of a unidimensional rule would lead to low accuracy) but could also be learned to a high degree of accuracy in a 1 h session. Previous studies of transfer in information integration learning to adjacent regions of stimulus space found higher accuracy for stimuli farther from the bound and lower accuracy in stimuli near the bound (Maddox et al., 2005; Maddox and Filoteo, 2011). Casale et al. (2012) examined transfer to stimuli at considerable distance from the Training region along the stimulus bound and found very low accuracy (in contrast with rule-based strategies, which support high generalization accuracy). They argued that such transfer involves analogical processing based on verbalizable rule knowledge, which is not present in information integration. Our goal was to maintain the same accuracy level for Flanking stimuli as the Training condition; this was necessary because we wanted to examine accurate categorization and so that any differences in neural activity between the two conditions could not be attributable to simple accuracy differences. For the Far stimuli, higher accuracy was unavoidable, so we chose to instead equate mean distance for both Far and Flanking stimuli from the Training region. As described in Results (see Behavioral results), our manipulations were successful: Flanking and Training were categorized at equivalent levels of accuracy in the transfer stage. Boundary stimuli were located between the two Training regions along the decision bound and were included to more completely represent the full distribution of distances from decision bound in the model-based analyses.
Procedure.
Subjects participated in two sessions: (1) a training session in a behavioral testing laboratory in the School of Psychology; and (2) a scanning session at the Brain Imaging Center. All subjects completed the training session in the morning and the scanning session in the afternoon of the same day. During the study, subjects were given written instructions in English and spoken instructions in Mandarin Chinese (their native language). All subjects had studied English previously, but to ensure comprehension, Chinese-speaking research assistants discussed the instructions with subjects in Chinese before beginning testing procedures. During training, subjects learned to categorize the stimuli through trial and error. On each trial, they saw a stimulus, made a decision about the correct category response (a button press with the right or left hand) and received feedback about whether they were correct. After each correct categorization decision, the word “Correct!” was shown for 0.75 s in green. After incorrect decisions, the word “Wrong” was shown for 0.75 s in red. Subjects trained until they reached a 90% accuracy criterion 10 times. This criterion was polled after 20 trials and was reset after 22. During training before scanning, subjects experienced only stimuli from the two training clusters.
After training, we checked that all subjects were using an information integration strategy by comparing several decision bound models (Ashby, 1992; Maddox and Ashby, 1993; Maddox and Filoteo, 2011; Ell and Ashby, 2012). The key model was the general linear classifier (GLC), which assumes an information integration strategy characterized by a diagonal decision bound and determines the best fitting bound based on the subject's responses. We compared the GLC with several additional models. Three models assumed a rule-based strategy. Two of the rule-based models assume unidimensional rules: (1) one based on orientation alone that determined the best-fitting vertical decision bound; and (2) one based on frequency alone that determined the best-fitting horizontal decision bound. The third rule-based model, the general conjunctive classifier (GCC), tests for a strategy of combining a value on one dimension with a value on the other as a conjunctive rule. In addition, we fit two guessing models: (1) one that tested for random responding; and (2) one that tested for a bias toward one response. To determine which model best fit each subject's behavioral data, we compared models with the Bayesian information criterion (Schwarz, 1978). All subjects who participated in the scanning sessions met a minimum criterion that their performance during the last tercile of the training trials was best fit by the GLC and thus consistent with information integration strategy use.
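The model comparison step can be illustrated with a short sketch of the Bayesian information criterion, BIC = k ln(n) − 2 ln(L), where k is the number of free parameters, n the number of trials, and L the maximized likelihood; the model with the lowest BIC is preferred. The log-likelihoods and parameter counts below are hypothetical placeholders chosen for illustration, not values from this study, and the function names are our own.

```python
import math


def bic(log_likelihood, n_params, n_trials):
    """Bayesian information criterion; lower values indicate a better fit."""
    return n_params * math.log(n_trials) - 2.0 * log_likelihood


def best_model(fits, n_trials):
    """Return (name of lowest-BIC model, dict of all BIC scores).

    fits maps a model name to (maximized log-likelihood, number of free parameters).
    """
    scores = {name: bic(ll, k, n_trials) for name, (ll, k) in fits.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits for one subject over 100 trials (illustrative numbers only).
example_fits = {
    "GLC": (-40.0, 3),
    "unidimensional_orientation": (-60.0, 2),
    "random_guessing": (-69.3, 0),
}
winner, scores = best_model(example_fits, n_trials=100)
```

The penalty term k ln(n) is what allows the GLC to be compared fairly against simpler rule-based and guessing models that have fewer free parameters.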
Subjects performed three fMRI scans during the scanning session, each of which consisted of 112 task trials. During each scan, 28 stimuli were drawn randomly from each of the four conditions (14 from Category A and 14 from Category B), as illustrated in Figure 1. The trial format was the same as during training, but to maximize scanning time while minimizing forgetting or strategy shifts that might occur in the absence of feedback, feedback was provided on 50% of the trials for each condition and category. The interval between trials was jittered according to a positively skewed geometric distribution, with a minimum interval of 2 s, a mean of 3.8 s, and a maximum of 9.5 s. Because the presentation of feedback was stochastic (occurring on only 50% of trials), we used a shorter duration between response and feedback. This interval was either 0.75 s (75% probability) or 2.25 s (25% probability). We optimized the efficiency of the design through random permutation testing with in-house software.
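As a rough illustration of the trial structure of one scan, the sketch below builds a 112-trial list with 14 trials per condition and category. It assumes that "50% of the trials" means exactly 7 of the 14 trials in each condition-by-category cell received feedback and that trial order was fully shuffled; both are assumptions, as are all names. The response-to-feedback delay uses the stated 75%/25% probabilities. Inter-trial jitter is omitted because the parameters of the geometric distribution are not fully specified in the text.

```python
import random

CONDITIONS = ("Training", "Flanking", "Far", "Boundary")
CATEGORIES = ("A", "B")


def build_scan_trials(rng):
    """Build one scan: 4 conditions x 2 categories x 14 trials = 112 trials."""
    trials = []
    for condition in CONDITIONS:
        for category in CATEGORIES:
            # Assumed: exactly half of each cell's 14 trials receive feedback.
            feedback_flags = [True] * 7 + [False] * 7
            rng.shuffle(feedback_flags)
            for has_feedback in feedback_flags:
                # Response-to-feedback delay: 0.75 s (75%) or 2.25 s (25%).
                delay = 0.75 if rng.random() < 0.75 else 2.25
                trials.append({
                    "condition": condition,
                    "category": category,
                    "feedback": has_feedback,
                    "feedback_delay_s": delay if has_feedback else None,
                })
    rng.shuffle(trials)  # assumed: fully randomized trial order
    return trials


scan = build_scan_trials(random.Random(1))
```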
Image acquisition.
Images were obtained with a 3.0 tesla MRI scanner (Siemens) at the Brain Imaging Center at South China Normal University. The scanner was equipped with a 12-channel head coil. Structural images were collected using a T1-weighted magnetization-prepared rapid gradient echo sequence [256 × 256 matrix; field of view (FOV), 256 mm; 192 1-mm slices]. Functional images were reconstructed from 25 axial oblique slices obtained using a T2*-weighted two-dimensional echoplanar sequence (repetition time, 1500 ms; echo time, 30 ms; flip angle, 76°; FOV, 220 mm; 64 × 64 matrix; 4.5-mm-thick slices). The first three volumes, which were collected before the magnetic field reached a steady state, were discarded.
Preprocessing.
Image preprocessing was performed using SPM 8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8). Preprocessing involved correction of slice time acquisition differences, motion correction of each volume to the first volume of the first run using third-degree B spline interpolation, and coregistration of the functional to the structural data. Individual subject high-resolution anatomical volumes were segmented and then normalized into Montreal Neurological Institute (MNI) space using the Eastern template of SPM, which is appropriate for Asian subjects. The resultant deformations were subsequently applied to the functional images. Spatial smoothing was performed with a 6 mm full-width at half-maximum Gaussian kernel. We additionally applied a 128 s high-pass temporal filter.
General linear model analyses.
We performed two types of general linear model (GLM) analyses: (1) one with the stimuli coded categorically (four conditions: Flanking, Far, Boundary, and Training); and (2) one incorporating a parametric modulation based on distances from the prototype and from the decision bound. For both, we modeled the duration of each stimulus event as the difference between the stimulus onset and the reaction time, and the duration of the feedback as the difference between its onset and offset (0.75 s). Such variable duration epoch models have been shown to be more sensitive for cognitive events of variable durations than impulse or constant-epoch models (Grinband et al., 2008). In the categorical model, we included separate mean regressors for the Training, Flanking, Boundary, and Far conditions. We included only correct trials in our primary analyses but included incorrect trials in the design matrix as a single regressor of no interest. We also included regressors for correct and incorrect feedback. In the parametric model, we did not include separate categorical regressors for the different conditions. Instead, we included a single mean regressor for all stimuli and included two parametric modulators: (1) distance from the bound; and (2) distance from the center of the Training cluster (the prototype). The values for each stimulus for these regressors are illustrated in the right column of Figure 1. It should be noted that, for the parametric analyses, we turned off the automatic orthogonalization in SPM, which allowed us to investigate the unique effects of each regressor. Therefore, these parametric modulators allowed us to investigate changes along the respective generalization gradients. As with the categorical model, we considered only correct trials for our primary analyses but included incorrect trials in the design matrix as regressors of no interest. We also included regressors for correct and incorrect feedback. 
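The two parametric modulators can be sketched in the unrotated stimulus space, where distances are unchanged by the 45° rotation. The sketch assumes that the decision bound lies at X = 350, midway between the two Training cluster means, and that each prototype is the center of its Training cluster at (260, 350) or (440, 350); the bound location and the function names are our assumptions rather than details given in the text.

```python
import math

BOUND_X = 350.0  # assumed: bound midway between the Training means (260 and 440)
PROTOTYPES = {"A": (260.0, 350.0), "B": (440.0, 350.0)}  # Training cluster centers


def distance_from_bound(x, y):
    """Perpendicular distance from the (assumed vertical) bound in unrotated space."""
    return abs(x - BOUND_X)


def distance_from_prototype(x, y, category):
    """Euclidean distance from the stimulus to its own category's prototype."""
    px, py = PROTOTYPES[category]
    return math.hypot(x - px, y - py)


# Center of the Far cluster for Category A (mean X = 100, mean Y = 350).
far_bound = distance_from_bound(100.0, 350.0)            # 250.0
far_proto = distance_from_prototype(100.0, 350.0, "A")   # 160.0
```

Because the two distances are correlated across stimuli, entering both modulators without serial orthogonalization, as described above, attributes to each only the variance it explains uniquely.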
We did not explicitly model a baseline task but instead compared conditions when appropriate with an implicit baseline consisting of the mean signal during unmodeled time points. Using an implicit baseline avoids the requirement of assuming that baseline time point activity will follow a specific hemodynamic response function and saves one degree of freedom in statistical analyses. All analyses were corrected for multiple comparisons using the topological false discovery rate method (Chumbley et al., 2010), with an initial (uncorrected) threshold set to p < 0.00001 and a corrected threshold of p < 0.05.
There is still controversy over whether the underlying cognitive level representation of categories incorporates explicit representations of prototypes and decision bounds or whether apparent use of prototypes and decision bounds emerges from other processes. Exemplar theories propose that subjects categorize new stimuli based on their average distance from all previously experienced instances (Kruschke, 1992). The striatal pattern classifier model developed by Ashby et al. (2011) proposes that subjects map small regions of perceptual space to category membership. Within our stimulus space, in which exemplars were distributed evenly across the stimulus space of the training regions, exemplar, prototype, and decision bound models make similar predictions as to the effect of distance. Our parametric regressors are intended to identify regions of the brain in which activity varies along particular generalization gradients but cannot speak to underlying mechanisms.
Results
Behavioral results
All subjects successfully reached the training criterion of 10 blocks of 20 trials with accuracy above 90%. In addition, all subjects' performance in the final tercile of training was best fit by the GLC model, indicating predecisional integration of information across the dimensions (Maddox and Filoteo, 2011; Ell and Ashby, 2012).
During the scanning session, subjects categorized stimuli from the four regions of perceptual space. As illustrated in Figure 2, accuracy differed between these conditions (F(3,45) = 81.25, p < 0.001, η2 = 0.74). Subjects were most accurate in the Far condition (mean ± SD, 97 ± 0.06%) with intermediate accuracy in the Training (mean ± SD, 77 ± 0.11% correct) and Flanking (mean ± SD, 80 ± 0.09% correct) conditions. They were least accurate in the Boundary condition (mean ± SD, 58 ± 0.05%). Pairwise comparisons indicated that the Far condition was significantly more accurate than the Training, Flanking, and Boundary conditions (t(15) = 8.39, t(15) = 6.37, t(15) = 28.6, respectively; all p values < 0.05). Training and Flanking were significantly more accurate than Boundary (t(15) = 8.4, t(15) = 8.17, respectively; p values < 0.05). Training and Flanking did not differ (t < 1.0).
As can be seen in Figure 2, reaction times overall paralleled accuracy. Reaction time differed significantly between conditions (F(3,45) = 17.77, p < 0.001, η2 = 0.15). Responses were fastest in the Far condition (mean ± SD, 824 ± 125 ms), intermediate in Training (mean ± SD, 952 ± 174 ms) and Flanking (mean ± SD, 929 ± 157 ms), and slowest in Boundary (mean ± SD, 1016 ± 211 ms). Pairwise comparisons indicated that the Far condition was significantly faster than the Training, Flanking, and Boundary conditions (t(15) = 5.81, t(15) = 5.07, t(15) = 5.47, respectively). Training and Flanking were significantly faster than Boundary (t(15) = 3.73, t(15) = 2.44, respectively). Training and Flanking did not differ (t < 1.0), consistent with previous behavioral studies (Maddox et al., 2005; Maddox and Filoteo, 2011).
Overall, the behavioral results match our predictions. Subjects maintained high accuracy when categorizing stimuli from the same regions on which they were trained (Training); accuracy and speed both improved when subjects generalized to stimuli farther from the decision bound (Far), and both decreased when subjects categorized stimuli closer to the decision bound (Boundary). As in our pilot testing, Flanking and Training stimuli had similar accuracy and reaction time. This indicates that subjects were able to generalize to the novel regions of perceptual space while maintaining high levels of performance.
Model-based analyses indicated that subjects successfully maintained an information integration strategy within the scanning session despite the change of context to the scanning environment, the introduction of transfer stimuli from new regions of perceptual space, and the reduction in the proportion of feedback to half the trials. Of the 16 subjects whose data were included in the analyses, 13 had behavior best fit by the GLC, indicating an information integration strategy, throughout all three terciles of the scanning task. The remaining three subjects' performance was best fit by the GLC in one or two of the terciles and by one of the other strategies (GCC, guessing, or unidimensional rule-based classifier) in the remaining terciles.
fMRI contrasts
Our first set of analyses verified that the predicted cortical network commonly recruited across categorization tasks was active in this task as well. To do this, we compared Training region stimuli with baseline, as illustrated in Figure 3 (blue overlay) and Table 1. Primary regions of activity included bilateral visual cortical regions extending from the primary visual cortex anteriorly and laterally to the inferior temporal gyri; regions of the basal ganglia, including the bilateral head of the caudate, putamen, and tail of the caudate; medial frontal cortex in the vicinity of the anterior cingulate; and bilateral inferior parietal cortex along the IPS. The same network was largely shared across all conditions, as shown in a conjunction analysis of all four individual contrasts versus baseline, illustrated in Figure 3 (red overlay).
We then compared individual conditions illustrative of generalization and decision bound implementation. For generalization, the primary contrasts of interest were those comparing the two conditions equated for distance from the training region, Far and Flanking, with the Training and Boundary regions, and each other. Although Far and Flanking regions were equally distant from the Training region, they differed in distance from the decision bound, so that differences between them can be attributed to factors related to implementation of the decision bound. Overall, similar regions were activated in contrasts comparing Far with other conditions. As illustrated in Figure 4 and Table 2, Far stimuli activated right and left regions of the superior parietal lobe, extending from the IPS to the postcentral gyrus. In addition, Far stimuli were associated with activity across large regions of the medial occipital lobe, including the lingual gyrus and cuneus, extending to the fusiform gyrus. For the Flanking condition, when compared with Boundary region stimuli, there was activity in the left inferior parietal lobe in the region of the IPS. There were no additional areas of activation when Training, Boundary, and Flanking were compared with each other.
For decision bound implementation, the primary contrasts of interest were those comparing the conditions near the decision bound with Far. As described in Table 2 (Near Bound Conditions > Far) and illustrated in Figure 4 (for Flanking > Far), all three contrasts (Flanking > Far, Training > Far, Boundary > Far) showed similar recruitment of salience network regions, including the bilateral frontoinsular and medial frontal cortices.
Parametric models
We formed two parametric regressors reflecting different distances in perceptual space. The first was distance from the decision bound. As shown in Table 3 and Figure 5, activity in the bilateral frontoinsular and medial frontal cortices, midbrain, and thalamus increased as stimuli were closer to the decision bound. Conversely, the farther stimuli were from the decision bound, the higher the activity in the superior parietal lobe, lingual gyri, bilateral putamen, and anterior hippocampus. The second parametric regressor was distance from the prototype for each category. This regressor isolated activity in one region, the left inferior parietal cortex, which increased in activity as stimulus distance from the prototype increased. No regions were activated significantly in the reverse contrast, that is, as a function of being closer to the prototype.
Feedback-related activity
Overall, feedback (both correct and incorrect compared with implicit baseline) activated inferior temporal and fusiform visual regions, which was expected because of the visual nature of the feedback. In addition, the bilateral angular gyrus and left lateral prefrontal cortex were active. The anterior caudate and anterior putamen were significantly active when correct feedback was compared with baseline but not when incorrect feedback was compared with baseline, consistent with studies finding greater striatal activity for correct feedback, potentially as a result of the rewarding and informational properties of such feedback (Seger et al., 2010; Tricomi and Fiez, 2012). In a direct comparison of correct versus incorrect feedback, incorrect feedback led to significantly greater activity in the salience network, consistent with research finding error-related activity in these regions (Klein et al., 2007; Ham et al., 2013). Correct feedback led to significantly greater activity in small regions of the precuneus and the supplementary motor area.
Discussion
We identified neural regions important for generalization and dissociated generalization processes from general-purpose executive functions required for implementation of decision criteria. Recent theoretical papers have argued that categorization should be considered a type of decision making (Seger and Peterson, 2013). One important aspect of decision making that categorization tasks are particularly suited for examining is uncertainty (Bach and Dolan, 2012; Ma and Jazayeri, 2014). To generalize, one must assess whether the novel stimulus is sufficiently related to previously studied stimuli to be considered a category member. Because category membership is graded, judgments are inherently uncertain; we refer to this as stimulus or representational uncertainty. We found that generalization to stimuli farther from the prototype corresponded with activity in left IPS, and generalization to stimuli farther from the decision bound corresponded with activity in superior parietal, lingual, and anterior hippocampal regions. Categorization tasks, particularly those using an A/B design, entail uncertainty about the decision bound dividing the two categories; we refer to this as decisional uncertainty (Seger and Peterson, 2013). We operationalized decisional uncertainty as closeness to the decision bound and found it recruited regions of the salience network.
Parietal and frontal roles in generalization
The primary regions associated with categorization and generalization were in the parietal lobe along the IPS (Fig. 6). The IPS has been associated with accumulation of information for perceptual decisions in many functional imaging studies (Ploran et al., 2007, 2011; Kayser et al., 2010; Sestieri et al., 2014). Neural activity in macaque lateral intraparietal (LIP) area reflects an accumulation of perceptual information relevant for a decision (Shadlen and Newsome, 2001; Shadlen and Kiani, 2013; Ibos and Freedman, 2014) and is sensitive to category boundaries (Freedman and Assad, 2006; Swaminathan and Freedman, 2012). The human homolog of area LIP is thought to be in the IPS, medial to the inferior parietal lobe (Sereno and Huang, 2014).
In addition to recruitment of IPS across conditions, some parietal regions showed greater activity for generalization. Distance from the prototype recruited a relatively lateral and anterior region of the left IPS, close to previously reported regions in A/not-A tasks (Zeithamova et al., 2008; Daniel et al., 2011). Greater distance from the decision bound recruited regions superior to IPS, close to regions active in an A/B task in the study by Zeithamova et al. (2008). The overall pattern of greater superior parietal activity farther from the decision bound parallels results in decision making and memory studies. White et al. (2012) found that classifying stimuli closer to a decision bound recruited IPS, whereas farther from the bound recruited the superior parietal cortex. In declarative memory tasks, superior parietal regions are associated with novelty decisions, whereas IPS and adjoining inferior parietal regions are associated with memory decisions (Vilberg and Rugg, 2008; Nelson et al., 2010; Kim, 2011; Johnson et al., 2013; Hutchinson et al., 2014); this difference between “new” and “old” memories may reflect implementation of a memorial response criterion.
In macaque research, the lateral prefrontal cortex has been associated with generalization (Antzoulatos and Miller, 2011, 2014; Swaminathan and Freedman, 2012), but we did not find such activity. It is possible that, in humans, prefrontal regions may only be required for abstraction to much more distantly related stimuli or in rule-based categorization.
Decision-bound conflict resolution
We found greater activity in the medial frontal and frontoinsular regions of the salience network for stimuli closer to the decision bound, which we interpret as attributable to greater conflict between categories for near-bound stimuli. The medial frontal cortex in particular has been associated with cognitive control of conflict (Rushworth and Behrens, 2008). Mechanisms underlying conflict sensitivity in the salience network are still unclear; current models include those based on actor–critic reinforcement learning (Silvetti et al., 2014) and information theory entropy measures (Fan et al., 2014).
Basal ganglia and hippocampus
Early research posited competitive interactions between the basal ganglia and hippocampal systems during category learning (Poldrack et al., 1999, 2001), but recent studies have found parallel recruitment when task demands require the computational functions of each region (Seger et al., 2011; Davis et al., 2012a,b). Both the hippocampus and basal ganglia showed complex patterns of activation in our study (Fig. 7).
Basal ganglia activation differed from that seen by Antzoulatos and Miller (2011), who found caudate neuron activity when categorizing previously learned stimuli but not during generalization. In our study, striatal activity during generalization was at least as great as for training stimuli, with some regions showing increased activity. We interpret basal ganglia activity according to the multiple corticostriatal loops framework (Ashby et al., 1998; Seger, 2008). The posterior caudate interacts with visual cortical regions and is associated with stimulus categorization (Seger and Cincotta, 2005; Seger et al., 2010; Lopez-Paniagua and Seger, 2011). The posterior caudate was recruited equally in all conditions; this may indicate that category knowledge acquired during training extended to the transfer regions. We found greater posterior putamen activity for Far stimuli. The posterior putamen is associated with motor functions (Peterson and Seger, 2013); however, motor demands were consistent across conditions in this study. Putamen activity may be attributable to more confident and/or faster responses in the Far condition; the putamen is active during speeded responses in studies investigating the speed–accuracy tradeoff (Mulder et al., 2014). Motor responses for Far stimuli may also be better learned; previous studies have shown a shift to the putamen during intermediate stages of motor skill development (Waldschmidt and Ashby, 2011).
The anterior hippocampus showed a significant negative deflection for Boundary stimuli, consistent with previous category learning studies (Poldrack et al., 1999, 2001; Seger and Cincotta, 2005). Seger et al. (2011) found that the anterior hippocampus was associated with novelty processing: activity was above baseline the first time a stimulus was presented but decreased below baseline during additional repetitions. Thus, one might be tempted to invoke novelty as a factor; Far stimuli are more extreme and so perhaps more novel than Boundary and Training. However, the lack of a similar effect in Flanking argues against this.
Relationship to the COVIS model
The COVIS (for “COmpetition between Verbal and Implicit Systems”) model posits that information integration categories are learned by mapping regions of perceptual space to categorical responses via the corticostriatal system passing from the visual cortex through the posterior caudate to premotor regions (Ashby et al., 1998, 2011; Cantwell et al., 2015). In COVIS, categorization relies on striatal units that map regions of perceptual space surrounding stimuli to categories. The size of these regions may depend on task factors, with the default being a small number of units each representing a relatively large region and more units being required if more fine-grained distinctions must be made, such as when representing a category defined as multiple discontinuous regions of perceptual space (Maddox and Filoteo, 2011). Given that our categories were relatively large and continuous, the COVIS model would suggest that our subjects may have recruited fewer striatal units each representing a large region of perceptual space, with generalization being primarily attributable to overlap between the transfer (Far and Flanking) regions and these large regions. This interpretation is supported by our finding of a similar degree of posterior caudate activity across all conditions. Alternatively, rule-based knowledge might have been involved, particularly for the Far region stimuli. Casale et al. (2012) found no transfer to a distant region in an information integration task but excellent transfer across a similar distance in a rule-based task, implying that prefrontally mediated verbalizable rules play a key role for generalization across extensive distances. Additional research using larger distances in perceptual space is necessary to determine the boundary conditions between when generalization can be accomplished through the striatal system and when it is necessary to shift to a rule-based system.
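The idea that a few broadly tuned units can support generalization through overlap with transfer regions can be made concrete with a toy example. The sketch below is not the COVIS implementation; it assumes Gaussian (radial basis) tuning, and the unit centers, widths, and stimulus coordinates are all hypothetical values chosen only to illustrate the argument.

```python
import math

def unit_activation(stimulus, center, width):
    """Gaussian activation of one hypothetical striatal unit."""
    sq_dist = sum((s - c) ** 2 for s, c in zip(stimulus, center))
    return math.exp(-sq_dist / (2 * width ** 2))

def categorize(stimulus, units):
    """Sum activation per category; return the best label and all support values."""
    support = {}
    for center, width, label in units:
        act = unit_activation(stimulus, center, width)
        support[label] = support.get(label, 0.0) + act
    return max(support, key=support.get), support

# One unit per category, centered on an assumed training region.
broad = [((30.0, 70.0), 25.0, "A"), ((70.0, 30.0), 25.0, "B")]
narrow = [((30.0, 70.0), 5.0, "A"), ((70.0, 30.0), 5.0, "B")]

far_stimulus = (10.0, 90.0)  # hypothetical transfer stimulus beyond the A training region
print(categorize(far_stimulus, broad))
print(categorize(far_stimulus, narrow))
```

With broad tuning, the transfer stimulus still drives the category A unit substantially, whereas with narrow tuning the same stimulus produces almost no activation in either unit. In this simplified picture, generalization to transfer regions follows from the size of the region each unit represents rather than from any additional mechanism.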
A/B versus A/not-A tasks and multiple memory systems
Our results also provide additional insight into the functions underlying A/B task performance that may help to explain differences between A/B and A/not-A tasks. A/not-A task performance is preserved in patients with amnesia and related disorders, but A/B performance is not (Reed et al., 1999; Smith and Minda, 2001; Smith and Grossman, 2008), which has been interpreted as evidence that these tasks recruit independent memory systems. Zeithamova et al. (2008) compared early A/not-A and A/B learning and found higher frontoparietal and hippocampal activity in A/B and higher visual cortex and basal ganglia activity in A/not-A. Casale and Ashby (2008) argued on the basis of behavioral work that A/B tasks involve recruitment of a broader array of memory systems than A/not-A tasks, which might rely on perceptual representations. One important difference between A/not-A and A/B tasks is that decision bound implementation is of greater importance in A/B tasks; although A/not-A tasks do require implementation of a decision threshold separating category members from nonmembers, the tasks used in neuropsychological studies made low demands on these processes in that category members needed only be distinguished from completely random stimuli.
Conclusion
We successfully dissociated neural regions underlying generalization (representational uncertainty) from those underlying conflict resolution in implementation of decision criteria (decisional uncertainty). Categorization tasks may prove particularly useful in other decision-making domains for studying transfer and generalization to novel situations.
Footnotes
This study was supported by National Natural Science Foundation of China Grants 31170997 and 31371050 and National Institutes of Health Grant R01 MH079182 (C.A.S.). Steven Hutchinson and Shawn Ell provided decision model fitting code. We thank Prof. Lei Mo for helpful advice, Wenliang Lu, Peiwen Xiang, and Quiying Liu for running subjects, and R. Marit Smith for assistance on a pilot version of this study.
The authors declare no competing financial interests.
Correspondence should be addressed to Carol A. Seger or Zhiya Liu, Center for the Study of Applied Psychology, Key Laboratory of Mental Health and Cognitive Science of Guangdong Province, School of Psychology, South China Normal University, Guangzhou 510631, China, Carol.Seger@colostate.edu or zhiyaliu@scnu.edu.cn.