Applying learned rules to solve new problems and adapting responses after rule changes are fundamental aspects of intelligence. How we accomplish these tasks has long intrigued researchers, but the experiments designed to answer this question often focus on a single species. This makes it challenging to identify general principles of cognition and to build frameworks for neural implementation and algorithms that may inspire the development of artificial intelligence. Only recently have studies begun to examine behavioral strategies across species within a shared decision-making paradigm, adopting rigorous computational modeling for direct comparisons. Such cross-species comparisons can reveal both unique and common cognitive functions and their underlying neurobiological basis. Despite the emergence of cross-species comparisons, however, advances in our understanding of rule learning and flexible decision-making have been limited by the use of simplistic stimuli in which only one feature varies from trial to trial. This contrasts with the complexity of real-life decision-making, where one must consider multiple features to determine the best option. This limitation can be addressed using the Wisconsin Card Sorting Test (WCST), which requires inference of multiple rules that govern reward outcome. The test has been used extensively to examine flexible rule-based cognition in primates (Kuwabara et al., 2014; Mansouri et al., 2020).
In the canonical WCST, subjects sort cards into different piles based on one of several visual features of the graphical elements on the card, such as their shape, color, or number. Importantly, the sorting rule is not explicitly revealed. Instead, after assigning a card to a given pile, subjects are told whether the card was correctly or incorrectly sorted, and they must learn the sorting rule from this feedback. The rule then changes without warning after several correct matches, so subjects must continue to use outcome feedback to infer which new feature is being rewarded. To learn rules efficiently and manage several rule switches within a session, subjects must keep track of multiple visual features that vary independently and accurately attribute the feedback to its corresponding feature.
Computational modeling is often used to describe behavioral strategies implicated in decision-making. Such modeling may be hypothesis-driven, incorporating specific assumptions about strategies involved in decision processes, or hypothesis-free, in which no assumptions are made (Calhoun et al., 2019). Previous models of the WCST mostly adopted the hypothesis-driven approach and have provided a cognitive control framework for rule learning (Gläscher et al., 2019). For instance, a selective attention model hypothesized that the rate of shifting after rule changes is determined by the level of attention assigned to reward versus punishment (Bishara et al., 2010). A limitation of the hypothesis-driven approach, however, is that it often assumes a static strategy throughout the task. This assumption overlooks adaptations to task complexity: in more complex WCST variants (Goudar et al., 2024), strategies become less transparent and require further quantification.
In contrast, hypothesis-free models make no predefined assumptions about strategy; instead, they train computational modules from which processes supporting flexible behavior may emerge. This approach uncovers latent processes and potential cross-species differences empirically. Unsupervised methods have recently been developed to identify internal states of behavior. For example, the hidden Markov model-generalized linear model (HMM-GLM) framework has been effectively adopted to reveal different strategies in rodent behavior (Calhoun et al., 2019). A recent study by Goudar et al. (2024) applied this framework to uncover and compare strategies for rule learning in humans and macaque monkeys.
Goudar et al. (2024) conducted cross-species comparisons between macaque monkeys and human subjects performing rule learning in a novel and more complicated variant of the WCST. On each trial, the authors presented four items, each taking one of four possible values along each of three dimensions (pattern, shape, and color). As in other versions of the test, subjects tried to select the item with the rewarded feature without being told what that feature was. Instead, they had to infer the reinforcement rule by trial and error, accumulating feedback across trials. Unlike the canonical WCST, which typically involves just two or three potential rules (e.g., match cards simply based on color or shape), the modified task has 12 possible rules. The combination of visual features, dimensions, and number of items makes it challenging to assign the feedback to a particular candidate rule. For example, a subject rewarded for selecting a round, striped, red item would not immediately know which of these three features was rewarded. The speed at which subjects learn the hidden rule indicates their capacity for rule inference, reflected in the learning rates of task performance in the WCST. Different learning rates between subjects suggest that the underlying strategies and mechanisms might differ.
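To make the credit-assignment problem concrete, the task structure can be sketched in a few lines of code. This is a hypothetical, simplified simulator: the dimension names follow the description above, but the item layout and the constraint that exactly one item carries the rewarded feature are our illustrative assumptions, not the published design.

```python
import random

DIMENSIONS = {
    "shape":   ["round", "square", "triangle", "star"],
    "pattern": ["striped", "dotted", "plain", "checked"],
    "color":   ["red", "blue", "green", "yellow"],
}
# 12 candidate rules: one feature from one dimension (3 dimensions x 4 values)
ALL_RULES = [(dim, feat) for dim, feats in DIMENSIONS.items() for feat in feats]

def make_trial(rule, rng):
    """Generate four items so that exactly one carries the rewarded feature
    (a simplifying assumption for this sketch)."""
    dim, feature = rule
    items = [{d: rng.choice(feats) for d, feats in DIMENSIONS.items()}
             for _ in range(4)]
    others = [f for f in DIMENSIONS[dim] if f != feature]
    for i, item in enumerate(items):
        item[dim] = feature if i == 0 else rng.choice(others)
    rng.shuffle(items)
    return items

def feedback(rule, chosen_item):
    """Reward is delivered iff the chosen item carries the rule's feature."""
    dim, feature = rule
    return chosen_item[dim] == feature
```

Even in this stripped-down version, a rewarded choice leaves three candidate explanations (one per dimension) for the outcome, which is exactly the credit-assignment ambiguity described above.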
To reveal the specific processes involved in rule learning, Goudar et al. (2024) adopted an unsupervised approach that combines input-dependent hidden states (input–output HMM, which considers various covariates) with GLMs to examine how outcome history influences decision-making. This framework aims to characterize how individuals integrate various sources of information into their choices. The transitions between these hidden states then reflect changes in behavioral strategies. In this model, inputs, including historical choices and rewards, influence both state transitions and choice formation via GLMs, allowing analysis of rule inference through trial history, which is an aspect often overlooked in previous models.
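The structure of such a model can be illustrated with a minimal numerical sketch. The code below computes a toy forward-pass log-likelihood for an input–output HMM in which both transitions and emissions are GLMs of the trial's covariates; the array shapes, the softmax link, and the recursion are generic illustrations under our own assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def io_hmm_glm_loglik(choices, inputs, W_trans, W_emit):
    """Log-likelihood of a choice sequence under a toy input-output HMM-GLM.

    choices: (T,) int array of discrete choices
    inputs:  (T, D) covariates (e.g., previous choice and previous reward)
    W_trans: (K, K, D) weights mapping inputs to state-transition logits
    W_emit:  (K, C, D) weights mapping inputs to per-state choice logits
    """
    K = W_emit.shape[0]
    log_alpha = np.full(K, -np.log(K))  # uniform prior over initial states
    for t in range(len(choices)):
        # input-dependent emission: P(choice_t | state_t, input_t)
        emit = softmax(W_emit @ inputs[t], axis=-1)            # (K, C)
        log_obs = np.log(emit[:, choices[t]] + 1e-12)
        if t == 0:
            log_alpha = log_alpha + log_obs
        else:
            # input-dependent transition: P(state_t | state_{t-1}, input_t)
            trans = softmax(W_trans @ inputs[t], axis=-1)      # (K, K)
            log_alpha = np.logaddexp.reduce(
                log_alpha[:, None] + np.log(trans + 1e-12), axis=0) + log_obs
    return np.logaddexp.reduce(log_alpha)
```

The key property this sketch shares with the framework described above is that the same inputs (past choices and rewards) enter both the transition and the emission GLMs, so trial history can shift the hidden strategy state as well as the choice itself.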
The authors optimized model parameters by varying history lags (the number of previous trials used as inputs) and hidden states, and they found that the best-fit models for humans and monkeys shared the same hyperparameters (i.e., adjustable variables that control how a model gets trained): one lag and four states. This indicates qualitative similarities in decision-making strategies between the two species. Based on feature choice probabilities, the four hidden states were labeled as persist, preferred, random, and avoid states. Further analysis of the most likely states for each trial revealed that both species used common rule inference strategies, such as win-stay, lose-shift, inference-like computation, and simultaneous exploration of multiple features.
Despite these similarities, the authors also identified key features that may contribute to differences in rule-updating speed between species. Monkeys made more perseverative errors than humans, frequently choosing features from previous rules after a rule change and showing less sensitivity to negative feedback. Additionally, monkeys engaged in more random exploration of irrelevant rule features, leading to inconsistent rule following compared with humans. Likely because of these inefficiencies, monkeys underperformed in identifying and switching between rules.
In summary, the input–output HMM-GLM approach used by Goudar et al. (2024) reveals decision-making processes in the WCST without relying on predefined hypotheses. The authors uncovered discrete states associated with task-relevant features, with transitions driven by choice outcomes. One may wonder, however, what factors might influence transitions between these hidden states. One possibility is that state transition probabilities vary with task engagement or motivation levels, which might be influenced by species-specific factors. For instance, monkeys may be more focused on rewards from successful rule application, whereas human engagement may be driven by multifaceted incentives, including a subjective feeling of competence or achievement, which may enhance their sensitivity to negative outcomes.
Another possibility is that the state transitions identified by the HMM-GLM are driven by changes in attention. From a Bayesian inference perspective, subjects could apply top-down control by selecting which dimensions are relevant to the current task and then effectively updating the representation of task rules through trial and error to achieve behavioral flexibility (Niv et al., 2015). In the current task, the rule switches once rule learning reaches a performance criterion. Humans may become aware of this criterion and thus pay a higher level of attention, monitoring their own performance as they follow the rules. Such an attentional mechanism could therefore support transitions between hidden states in the WCST.
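The Bayesian inference perspective can be made concrete with a toy posterior update over candidate rules. In this sketch, the uniform lapse rate `eps` and the feature-level rule set are our illustrative assumptions, not a fitted model from either study.

```python
def update_rule_posterior(posterior, chosen_features, rewarded, eps=0.05):
    """One Bayesian update of belief over candidate rules.

    posterior: dict mapping each candidate rule (a feature) to P(rule)
    chosen_features: set of features on the item the subject picked
    rewarded: bool outcome feedback; eps is a lapse/noise rate
    """
    new = {}
    for rule, p in posterior.items():
        # outcome is consistent with `rule` iff reward co-occurred with
        # choosing that feature (up to lapse noise)
        consistent = (rule in chosen_features) == rewarded
        likelihood = 1.0 - eps if consistent else eps
        new[rule] = p * likelihood
    z = sum(new.values())
    return {rule: p / z for rule, p in new.items()}
```

After one rewarded choice of a round, striped, red item, belief in features absent from the chosen item drops, while the three chosen features remain tied until further feedback disambiguates them; selective attention could act by restricting or reweighting which dimensions enter this update.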
What factors might contribute to species differences in rule switching? We speculate that there are two main sources. On the one hand, the observed differences could be attributed to internal or biological factors inherent to each species, such as variations in prefrontal functioning. For example, the monkeys' higher rate of perseverative errors and lower sensitivity to errors in the current WCST (Goudar et al., 2024) mirror the performance of human patients with prefrontal lesions (Gläscher et al., 2019). Furthermore, it is important to consider species differences in cell types, functions, or gene expression profiles in brain regions relevant to rule learning and memory, such as prefrontal regions and the hippocampus. Ongoing work has focused on understanding how neurons in these regions represent abstract rules and whether there is a latent organization, or mental schema, of task structure (Goudar et al., 2023), akin to a “cognitive map” in spatial cognition. The field would gain a more in-depth understanding from cross-species comparisons of how such mental structures are established in the brain and used to guide flexible behavior.
A related question concerns the level at which rule abstraction occurs, and whether monkeys and humans form similar hierarchical organizations of rules. For instance, a recent study revealed differences between human and monkey players in the complexity of strategies used to solve decision-making problems in the video game Pac-Man (Yang et al., 2024). In the context of the WCST, it would be interesting to investigate whether and how abstract rule learning (i.e., grouping features and generalizing across dimensions) and concrete rule learning (i.e., learning individual visual features) are acquired and performed differently across species.
On the other hand, differences in task performance may arise from external factors, such as training protocols, task settings, prior experiences, and motivations. For example, the high-dimensional rule switches in this WCST design, occurring both intra- and extradimensionally, may be more familiar to humans, who routinely manage the trade-off between computational load and efficiency. Additionally, because these experiments are computer-based, the testing environment is likely more intuitive for humans than for monkeys. Moreover, humans explicitly volunteer to participate in a study of cognitive functions, whereas laboratory monkeys are typically unaware of the experiment's aim and, without extensive training, are less likely to formulate a logical rule to follow. Finally, Goudar et al. (2024) noted that monkeys and humans can have different motivational states because of the format of reward in the task (i.e., food vs. correct feedback), which may contribute to the observed differences in exploration and exploitation between the two species. As a result, training monkeys often requires extensive, multistage processes, making it challenging to compare their performance directly with that of humans given minimal training in the task. We advocate for minimizing such external influences to ensure more accurate interspecies comparisons. The authors addressed this issue by giving human subjects only limited explicit instructions, making their experience more comparable with that of the monkeys, who received no verbal instructions.
In conclusion, Goudar et al. (2024) adopted a newly designed WCST task and a hypothesis-free input–output HMM-GLM model to quantify rule inference strategies in humans and monkeys. The findings showed that, while monkeys could flexibly switch rules, they did so more slowly, with more random exploration and reduced error sensitivity compared with human subjects. The model successfully captured these cross-species differences in rule-switching strategies, offering a valuable framework for unbiased comparisons of rule inference across species. This study contributes to a growing trend in cross-species analyses of complex behavior, calling for further research into the neural and circuit mechanisms underlying cognitive functions like rule switching.
Footnotes
Review of Goudar et al.
We thank Dr. Erin Rich and Dr. Teresa Esch for useful feedback on the manuscript drafts and Dr. Ning-long Xu for resources and supervision. This work was supported by the Science and Technology Commission of Shanghai Municipality (Yangfan Program, 24YF2752200) to N.Z.
This Journal Club was mentored by Erin Rich.
The authors declare no competing financial interests.
Editor’s Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/jneurosci-journal-club.
Correspondence should be addressed to Ningyu Zhang at zhangny@ion.ac.cn.