Abstract
Biological neural networks adapt and learn in diverse behavioral contexts. Artificial neural networks (ANNs) have exploited biological properties to solve complex problems. However, despite their effectiveness for specific tasks, ANNs have yet to realize the flexibility and adaptability of biological cognition. This review highlights recent computational and experimental advances that further our understanding of biological and artificial intelligence. In particular, we discuss critical mechanisms from cellular, systems, and cognitive neuroscience that have contributed to refining the architectures and training algorithms of ANNs. Additionally, we discuss how recent work has used ANNs to understand complex neuronal correlates of cognition and to process high-throughput behavioral data.
Introduction
Recent technological advances have transformed our access to the fine-grained spatiotemporal organization of the anatomy and physiology of biological neural networks. Over the years, big data on an astounding diversity of genes, proteins, neurons and glia, dendrites, synapses, and neural network functions have transformed our understanding of the brain (Sejnowski et al., 2014). In turn, brain-inspired implementations of artificial neural networks (ANNs), such as the perceptron model (McCulloch and Pitts, 1943; Rosenblatt, 1958), Boltzmann machines (Ackley et al., 1985), and Hopfield networks (Hopfield, 1982), have had profound implications for biological research and computational problems. These ANN architectures have been foundational for applications in pattern completion, attractor networks, dynamical systems, and the diverse algorithmic capabilities of modern convolutional, multilayer, and recurrent neural network (RNN) models.
Although biological neural networks have continued to guide the development of their artificial counterparts, mathematics and statistical physics have carried the beacon in developing efficient optimization methods (Sutskever et al., 2011; Cox and Dean, 2014). ANNs have leapfrogged from nonlinear systems and networks (Minsky and Papert, 1972; Haykin, 1994) to deep and recurrent networks (LeCun et al., 2015; Schmidhuber, 2015). In particular, backpropagation of error (Werbos, 1974, 1982; Rumelhart et al., 1986) enabled the efficient training of neural networks by computing gradients of the error with respect to the weights of a multilayer network. Although methods to train ANNs have evolved to include improved weight initializations, optimization, and gradient descent algorithms, these methods do not appear to have direct neurobiological analogs (Marblestone et al., 2016).
Here, we review the current state of the art of ANN models in terms of "biological realism," along with their applications and limitations, with the ultimate aim of identifying the operational principles and functional settings through which biological neural networks and ANNs can inform each other toward synergistic development. With this background, the review focuses on four distinct questions.
How can biological intelligence guide the refinement of ANN architectures?
How can ANNs drive a better understanding of cognition?
What are the limitations of ANNs with respect to modeling human cognition?
What are the recent advances in applying ANNs to quantify complex behavior?
Refining ANNs with biological precision
ANNs share many interesting features with biological neural networks. This is, of course, no accident, as the original ANN algorithms were in part inspired by the anatomy of the cerebral cortex (Sejnowski, 2020). The successful use of ANNs to model computations has forced a recalibration of our working models of the nervous system, leading to the embrace of dynamical models of computation that incorporate distributed computations across widespread, ever-changing networks (Sohn et al., 2019). Advances in cellular neuroscience, neuroimaging, and computational modeling are enabling the integration of new details into advanced versions of ANNs that will, hopefully, bring us closer to the goal of understanding how the brain works, while simultaneously refining artificial intelligence (AI).
A prime example of an emerging tension between neuroscience and AI is the recognition that pyramidal neurons, the workhorse of the cerebral cortex and the primary feature mimicked by ANNs, have highly nonlinear operating modes (Larkum, 2013). Traditional models assumed that the dendrites of pyramidal neurons linearly summed their synaptic inputs within a given time window, and that the neuron spiked only when the summed input exceeded a certain threshold (Larkum, 2022). In stark contrast, recent work has clearly demonstrated that many pyramidal neurons in the cerebral cortex have distinct modes of operation, sometimes firing linearly with inputs and other times ignoring inputs altogether (Fig. 1A,B) (Ramaswamy and Markram, 2015; Roelfsema and Holtmaat, 2018; Richards et al., 2019). Rather than reflecting passive integration of inputs, the active dendrites of pyramidal neurons have been shown to underpin striking computational complexity (Johnston and Narayanan, 2008; Spruston, 2008; Poirazi and Papoutsi, 2020; Larkum, 2022). Indeed, deep neural networks with at least 5-8 layers are needed to model the complex input/output functions of pyramidal cells (Beniaguev et al., 2021). The ability to convert the presumed integrator-like dynamics of neurons and their dendrites to coincidence detectors (or resonators) is an important function performed by specific ion channels (Rudolph and Destexhe, 2003; Ratté et al., 2013). The identity of individual neurons (integrators vs resonators) has important implications for connectivity, computation, and information coding (Rudolph and Destexhe, 2003; Ratté et al., 2013). Such features have recently been incorporated into ANNs toward solving the so-called credit-assignment problem (Payeur et al., 2021) using single-phase learning (Fig. 1C,D) (Greedy et al., 2022).
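To illustrate the contrast with the point-neuron assumption, the following Python sketch implements a toy two-compartment unit in which an apical "dendritic" compartment acts as a coincidence detector that gates a multiplicative boost of the somatic response. The thresholds and gains are arbitrary illustrative values, not a fit to any of the cited models.

```python
import numpy as np

def two_compartment_unit(basal_input, apical_input, theta_d=1.0, gain=2.0):
    """Illustrative two-compartment unit: basal (somatic) drive is
    integrated linearly, while the apical (dendritic) compartment acts
    as a coincidence detector that multiplicatively boosts the output
    whenever its own nonlinear threshold is crossed."""
    somatic = np.maximum(basal_input, 0.0)       # rectified linear integration
    dendritic_event = apical_input > theta_d     # all-or-none dendritic spike
    return somatic * np.where(dendritic_event, gain, 1.0)

# The same basal drive yields different outputs depending on whether
# apical input coincides with it -- a simple departure from purely
# linear summation in a point neuron.
basal = np.array([0.5, 0.5, 1.5])
apical = np.array([0.0, 1.5, 1.5])
print(two_compartment_unit(basal, apical))  # [0.5, 1.0, 3.0]
```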
Plasticity, as a biological mechanism, is moreover not limited to synaptic contacts. With emerging roles attributed to entire neurons (engram cells) in the physiology of memory, learning theories that focus exclusively on synaptic plasticity appear inadequate as a premise for models of ANNs (Titley et al., 2017; Lisman et al., 2018; Josselyn and Tonegawa, 2020). Plasticity is a ubiquitous phenomenon that spans multiple scales of organization in biological neural networks: from synapses and dendritic branches to neurons and microcircuits (Le Bé and Markram, 2006; Branco and Häusser, 2010; Titley et al., 2017; Mishra and Narayanan, 2021). Incorporating a broad repertoire of plasticity mechanisms, such as those available to biological neural networks, is an essential step in refining ANN architectures and extending their utility. We are only scratching the surface of the potential for biological insights to suggest novel algorithmic solutions to problems traditionally tackled with classical network architectures, such as RNNs.
The cerebral cortex is also deeply embedded within a web of dense interconnections with a number of highly conserved subcortical structures whose functional importance to the working of the nervous system should not be understated. One structure that is often overlooked in ANNs is the thalamus, a bilateral structure in the diencephalon that is densely (and heterogeneously) interconnected with the cerebral cortex (Jones, 2001). Although the functional benefits of one class of thalamocortical cells, the so-called "core" cells, are relatively well understood (Crandall et al., 2015), the more diffusely projecting "matrix" cells remain enigmatic. Recently, a neural mass model of the corticothalamic system was created to investigate the impact of this topological projection on emergent whole-brain dynamics (Shine, 2021). In brief, the model found that the matrix cells tuned the functional repertoire of the brain, providing a flexible, yet robust, platform for instantiating an array of novel combinations of cortical coalitions. Others have shown that these same cells can alter information flow in neural circuits (Anastasiades et al., 2021; Mukherjee et al., 2021) and are crucial sites for behaviorally relevant plasticity (Williams and Holtmaat, 2019). It would be interesting to explore how these circuit-level features could inform future implementations of ANNs, such as models that mimic the interactions between the cerebellum and cortex (Pemberton et al., 2021; Boven et al., 2022).
The operating mode of the cerebral cortex (along with the rest of the brain) is also fundamentally altered by the presence (or absence) of neuromodulatory chemicals, such as noradrenaline, acetylcholine, and serotonin. By interacting with GPCRs on target neurons and glia, these ligands can alter the excitability and receptivity of the network (Shine et al., 2021), facilitating different information processing regimes that shift neural populations between information storage and information transfer (Li et al., 2019). These changes in gain, while relatively low-dimensional, can substantially impact the functional outputs of ANNs (Stroud et al., 2018), suggesting that their incorporation into modern deep learning architectures could be quite informative (Mei et al., 2022). In addition, by combining these approaches with sophisticated, high-resolution recordings of the neuromodulatory system in vivo (Breton-Provencher et al., 2022), we can simultaneously test hypotheses about the functional operation of the brain.
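As a minimal illustration of how such gain changes could enter an ANN, the Python sketch below applies a multiplicative, neuron-specific gain inside a rate-RNN update, in the spirit of (but far simpler than) the gain modulation studied by Stroud et al. (2018). The weights, gains, and time constants are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
W = rng.normal(0, 1.0 / np.sqrt(N), (N, N))   # fixed recurrent weights

def rnn_step(r, x_in, g, dt=0.1):
    """One Euler step of a rate RNN whose units receive a neuron-specific
    multiplicative gain g, mimicking neuromodulatory control of
    excitability without changing any synaptic weight."""
    return r + dt * (-r + np.tanh(g * (W @ r + x_in)))

r = np.zeros(N)
x = rng.normal(0, 1, N)
low_gain, high_gain = 0.5 * np.ones(N), 1.5 * np.ones(N)
for _ in range(50):
    r = rnn_step(r, x, high_gain)   # swap in low_gain for a damped regime
print(r.std())
```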
A prominent example of biologically inspired ANNs that has gained considerable interest in machine vision is the convolutional neural network (CNN). CNNs are extensions of ANNs with an architecture inspired by the mammalian visual system, with convolutions representing the function of simple cells and pooling operations that of complex cells (Lindsay, 2021). When trained appropriately, these models can produce representations that match those of biological visual systems better than previous models (Khaligh-Razavi and Kriegeskorte, 2014; Yamins et al., 2014). Traditionally, CNNs are strictly feedforward; that is, they do not include lateral or feedback recurrent connections (Fig. 2). Yet the human visual system contains many such connections, and they are implicated in important computations, such as object recognition (Wyatte et al., 2012). Previous work has shown how these connections can make models better at visual tasks and better match biological processing (Fig. 2) (Spoerer et al., 2017; Linsley et al., 2018; Kubilius et al., 2019; Nayebi et al., 2021). An unmet potential of these models, however, is to use them as an idealized experimental setup for analyzing the computational role that recurrence plays. Promising work in this direction has shown that recurrence can help object classification by carrying information about unrelated, auxiliary variables (Thorat et al., 2021).
In a recent study (Lindsay et al., 2022), four different kinds of recurrence were added to a feedforward CNN: feedback connections implementing predictive processing in one network, lateral connections implementing surround suppression in another, and feedback and lateral connections trained directly to classify degraded images in two more. The task, in which the network must classify images of digits degraded by one of several types of noise, such as occlusion and blur, was chosen to capture some of the functions believed to be performed by recurrence. Counterintuitively, the anatomic form of recurrence added to the CNN did not determine its function in these models: both forms of task-trained recurrence (feedback and lateral connections) changed neural activity and behavior similarly to each other and differently from their bio-inspired anatomic counterparts. Specifically, in the case of feedback, predictive feedback denoises the representation of noisy images at the first layer of the network, leading to an expected increase in classification performance. In the task-trained networks, representations are not denoised over time at the first layer (indeed, they become "noisier"), yet these dynamics do lead to denoising at later layers and increased performance. We analyzed an open fMRI dataset (Abdelhack and Kamitani, 2018) using the same tools applied to the models, such as dimensionality reduction, activity correlations, and representational geometry analysis, and found weak support for the predictive feedback model. Such analysis of artificial networks provides an opportunity to test the tools of systems neuroscience (Lindsay, 2022).
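For readers who want the mechanics, here is a toy PyTorch sketch of adding lateral recurrence to a convolutional layer and unrolling it over time. It follows the general recipe of the recurrent CNNs cited above but is not any of the published architectures; every layer size and step count is an arbitrary choice.

```python
import torch
import torch.nn as nn

class LateralRecurrentConv(nn.Module):
    """Convolutional layer unrolled in time with an added lateral
    recurrent convolution; a toy version of the recurrence studied by
    Spoerer et al. (2017) and Lindsay et al. (2022), not their code."""
    def __init__(self, in_ch=1, out_ch=16, steps=4):
        super().__init__()
        self.ff = nn.Conv2d(in_ch, out_ch, 3, padding=1)         # feedforward drive
        self.lateral = nn.Conv2d(out_ch, out_ch, 3, padding=1)   # within-layer recurrence
        self.steps = steps

    def forward(self, x):
        h = torch.relu(self.ff(x))
        for _ in range(self.steps - 1):
            # recurrent dynamics: the feedforward input re-enters at every step
            h = torch.relu(self.ff(x) + self.lateral(h))
        return h

# Degraded (noisy) digits could then be classified from the final state.
img = torch.randn(8, 1, 28, 28)          # stand-in for noisy digit images
features = LateralRecurrentConv()(img)   # shape (8, 16, 28, 28)
```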
Decisions, artificial RNNs, and functional neuron types
Many ingredients make up our decisions: a rich stream of sensory information, a lifetime of memories, long-term goals, and current mood or emotions. This poses a challenge in identifying the neural processes of decision formation: the activity of cortical neurons, for instance, reflects an equally large complexity of decision-related features, from sensory and spatial information (Rao et al., 1997), to short-term memory (Funahashi et al., 1989), economic value (Padoa-Schioppa and Assad, 2006), risk (Ogawa et al., 2013) and confidence (Kepecs et al., 2008), or abstract rules (Wallis et al., 2001). Furthermore, single neurons often demonstrate mixtures of these features (Mante et al., 2013; Rigotti et al., 2013; Fusi et al., 2016), precluding straightforward functional interpretations of the signals they carry (Fig. 3). How can we identify organizational principles by which cortical neurons or neural networks take part in decision-making?
Recent approaches have focused on neural populations as the primary computational units for cognition (Pandarinath et al., 2018; Saxena and Cunningham, 2019; Vyas et al., 2020; Barack and Krakauer, 2021; Duncker and Sahani, 2021; Ebitz and Hayden, 2021; Jazayeri and Ostojic, 2021). Population approaches identify low-dimensional patterns in neural population data, describing the subspaces or manifolds in which neural trajectories move (Cunningham and Yu, 2014). Applied to neural population recordings during flexible decision-making, for example, these methods can reliably separate information about sensory inputs, choice, and rules at the level of neural populations (Mante et al., 2013; Malagon-Vina et al., 2018; Sohn et al., 2019; Aoi et al., 2020; Ebitz and Hayden, 2021).
RNNs, which are able to mimic the complexity of real cortical responses, have served as a valuable model for understanding computation in large heterogeneous neural populations. For instance, RNNs suggest dynamic network mechanisms by which decision rules are flexibly applied to determine a decision (Mante et al., 2013). Hopfield networks and restricted Boltzmann machines provide valuable insight into the storage and retrieval of associative memories via unsupervised learning rules (Marullo and Agliari, 2020). Recently, supervised learning approaches have been used to train RNNs to perform the same cognitive tasks as behaving animals. This approach provides a powerful alternative for studying how the neural computations underlying cognitive tasks are distributed across heterogeneous populations (Fig. 3) (Mante et al., 2013; Song et al., 2016; Wang et al., 2018; Yang et al., 2019) and how networks leverage previously learned tasks for continual learning (Driscoll et al., 2022). Because of the complexity of their connectivity and dynamics, reverse engineering trained RNNs mimics the challenges faced when analyzing real neural data. This observation has motivated their use as a testbed for candidate dimensionality reduction methods aimed at uncovering low-dimensional latent dynamics. Such methods model heterogeneous neural responses as linear mixtures of task-relevant variables and can uncover neural mechanisms that exist only at the population level (Cunningham and Yu, 2014; Kobak et al., 2016). The ability to perform precise perturbation tests in RNNs (Yang et al., 2019) offers the possibility of validating the causal role of neural representations revealed by candidate dimensionality reduction strategies (Mante et al., 2013; Song et al., 2016; Wang et al., 2018; Yang et al., 2019).
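A minimal Python sketch of the linear-mixing idea follows: simulated neurons mix two hypothetical task variables, and regressing responses on those variables recovers de-mixed population axes. This is a bare-bones caricature of targeted dimensionality reduction (Mante et al., 2013; Kobak et al., 2016), with synthetic data and illustrative dimensions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_trials = 80, 500
# Hypothetical task variables on each trial: stimulus and choice.
stim = rng.choice([-1.0, 1.0], n_trials)
choice = np.sign(stim + 0.5 * rng.normal(size=n_trials))
# Each neuron linearly mixes both variables plus noise (the "heterogeneity").
mix = rng.normal(size=(n_neurons, 2))
R = mix @ np.vstack([stim, choice]) + 0.5 * rng.normal(size=(n_neurons, n_trials))

# Regress each neuron's response on the task variables and use the
# coefficients to define task-related axes in population state space.
X = np.vstack([stim, choice, np.ones(n_trials)]).T
beta, *_ = np.linalg.lstsq(X, R.T, rcond=None)      # shape (3, n_neurons)
stim_axis = beta[0] / np.linalg.norm(beta[0])
choice_axis = beta[1] / np.linalg.norm(beta[1])

# Projecting population activity onto these axes de-mixes the signals.
print(np.corrcoef(stim_axis @ R, stim)[0, 1])       # close to 1
print(np.corrcoef(choice_axis @ R, choice)[0, 1])   # close to 1
```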
In a recent study, RNNs were trained on cognitive tasks to develop and validate latent circuit models of heterogeneous neural responses (Langdon and Engel, 2022) (Fig. 3). The latent circuit model uncovers low-dimensional task-relevant representations together with the recurrent circuit interactions between them. To validate this method, RNNs were trained on a motion-color discrimination task in which the subject must flexibly discriminate either the motion or the color of a random dot stimulus depending on a contextual cue (Langdon and Engel, 2022) (Fig. 3). Fitting a latent circuit model to the responses of this RNN revealed a latent inhibitory mechanism in which contextual representations inhibit irrelevant stimulus representations, allowing the network to flexibly select the correct stimulus-response association (Langdon and Engel, 2022). This inhibitory mechanism is mirrored in the population dynamics as a suppression of irrelevant stimulus representations.
Despite the success of population-centered analyses, recent studies have discovered groups of single neurons with prototypical dynamic activity and encoding of decision variables. For instance, Hirokawa et al. (2019) started from neural population activity but asked whether neurons' dynamic activity and tuning to decision variables, in a combined sensory and value-based decision task, fell into distinct groups with distinct dynamic and tuning profiles. Unsupervised clustering revealed dedicated groups of single neurons in rat orbitofrontal cortex that were tuned to canonical decision variables, that is, combinations of task features that explained the animals' decision behavior, such as reward likelihood, integrated value, and choice outcome. A dedicated group of neurons in the orbitofrontal cortex carried information about the certainty that a decision was correct (i.e., decision confidence) (Masset et al., 2020). These neurons predicted subsequent confidence-guided behavior: the variable time rats invested in their decision to obtain an uncertain, delayed reward before abandoning their investment (Lak et al., 2014; Ott et al., 2018). These groups of neurons might constitute functional neuron types, characterized by specific algorithmic roles in decision computations (Christensen et al., 2022). Similarly, functional clusters were found in the orbitofrontal cortex during value-based decision tasks in rats (Hocker et al., 2021) and primates (Onken et al., 2019), and in mice using calcium imaging during associative learning (Namboodiri et al., 2019).
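The clustering logic can be sketched in a few lines of Python with scikit-learn: generate hypothetical per-neuron tuning coefficients from discrete functional types, cluster them, and check cluster quality. The decision variables, cluster count, and noise level are all illustrative, not values from the cited studies.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(2)
# Hypothetical per-neuron tuning coefficients for three decision
# variables (e.g., reward likelihood, integrated value, outcome),
# generated from three discrete functional types plus noise.
prototypes = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
labels_true = rng.integers(0, 3, 300)
tuning = prototypes[labels_true] + 0.15 * rng.normal(size=(300, 3))

# Cluster neurons by tuning profile; a high silhouette score suggests
# discrete functional types rather than a continuum of mixed selectivity.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(tuning)
print(silhouette_score(tuning, km.labels_))
```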
When and why might we expect to find functional neuron types? Recent computational studies using RNNs suggest that neural subpopulations with distinct dynamics or categorical representations arise in networks trained on tasks requiring flexible decision-making, such as context-dependent decision tasks (Dubreuil et al., 2022; Flesch et al., 2022; Langdon and Engel, 2022). Functional neuron types might thus be a feature shared by biological neural networks and ANNs that provides a robust computational solution for flexible decision-making. On the other hand, these interpretations are limited, because it is unclear what the biological counterpart of an ANN unit precisely is. While many approaches interpret RNN units as candidates for single neurons (Barrett et al., 2019), single neurons perform computations far more complex than those of simple RNN units, and these computations can only be captured by deep networks themselves (Beniaguev et al., 2021). Specifically, the functional coupling between neuronal compartments (dendrites and soma; see previous section) can be controlled by thalamic input (Aru et al., 2020) and depends on learning (d'Aquin et al., 2022), further suggesting that RNN units might correspond to computations of neuronal compartments or biophysical processes. Categorical representations in RNNs might thus shed light on the functions performed by biophysical elements of neurons.
Functional neuron types might emerge as a result of the cortical microcircuit structure. Emerging evidence suggests that cortical cell types, defined by distinct gene expression or connectivity patterns (Tasic et al., 2018; Winnubst et al., 2019), assume specialized functions during decision-making. For example, orbitofrontal cortex neurons that project to the striatum predominantly carry sustained task-related signals (Bari et al., 2019; Terra et al., 2020), such as information about choice outcome (Hirokawa et al., 2019) (whether the animal was rewarded or not), and distinct projection-defined neuron classes in motor cortex carry movement onset or choice signals (Economo et al., 2018). Cell type identity might thus be a structural constraint on the dynamic decision algorithms in biological neural networks that could inform the design of ANNs (Sacramento et al., 2018; Greedy et al., 2022).
Uncertainty and decisions: can insights from human (and animal) cognition contribute to AI development?
Humans make decisions based on perceptual, value-based, or other information. Such decisions are accompanied by a sense of confidence. That is, our brains seem to compute not only the best decisional outcome, but also estimates of the probability that the decision is correct (Pouget et al., 2016; Mamassian, 2022; Peters, 2022). This sense of uncertainty accompanies moment-to-moment information processing across many perceptual and cognitive domains, and can help an organism decide whether to update its internal models of the world, how to allocate resources, or how to sample new information. Importantly, computing decisional confidence can also help us better learn from erroneous predictions (Guggenmos et al., 2016; Stolyarova et al., 2019; Ptasczynski et al., 2022).
An argument can thus be made that getting artificial systems to also compute such a confidence judgment could lead not only to better decision-making under uncertainty, but also to better and more self-directed learning. A foundational goal of AI research is to build systems that not only behave adaptively in the world, but that "know" when they have made correct or erroneous decisions, or when their uncertainty is high enough that they should sample more information before committing to a decision at all. Thus far, most "confidence"-type signals in artificial systems are derived from probabilistic inference: for example, the variance of a (posterior) probability distribution, or the entropy of an outcome distribution, can be reasonable proxies for confidence in biological systems (Li and Ma, 2020), because these quantities reflect the relative evidence in favor of multiple possible decisional outcomes. However, there are a number of problems with these approaches to uncertainty estimation in artificial systems.
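For concreteness, the Python sketch below computes one such proxy: a normalized-entropy confidence over softmax outputs. The logits are arbitrary examples rather than outputs of any particular model.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def confidence_from_entropy(logits):
    """Confidence proxy: 1 minus the normalized entropy of the outcome
    distribution, so a peaked posterior yields confidence near 1 and a
    flat (chance-level) posterior yields confidence near 0."""
    p = softmax(logits)
    entropy = -np.sum(p * np.log(p + 1e-12))
    return 1.0 - entropy / np.log(len(p))

print(confidence_from_entropy(np.array([4.0, 0.1, 0.1])))  # high confidence
print(confidence_from_entropy(np.array([1.0, 1.0, 1.0])))  # ~0, maximal uncertainty
```

In principle, a system could be given a rudimentary "opt-out" rule by withholding its decision and sampling more information whenever this value falls below a threshold, foreshadowing the second issue discussed below.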
First, it is not clear that humans and other animals rate confidence according to optimal inference, as implemented in ANNs or similar systems; instead, a large body of work suggests that other influences on decisional confidence are likely, ranging from motor preparation and execution (Fleming et al., 2015) to detectability heuristics (Maniscalco et al., 2016, 2021; Rausch et al., 2018). While these contributions to uncertainty/confidence estimates may seem suboptimal or even random, our nervous systems have been optimized through millennia of evolution, such that apparent "biases" in confidence judgments may actually reflect behavior that is optimal under a cost function that remains unknown to us as researchers (Michel and Peters, 2021).
Second, current implementations of confidence in multialternative decisions often do not give an artificial system the option to "opt out" of the decision and instead sample more information; current AI does not have agency in this fashion. However, we know that biological observers use confidence to guide their decision-making behavior, including decisions about whether and how to continue sampling their environments (Kepecs and Mainen, 2012; Guggenmos et al., 2016; Stolyarova et al., 2019; Ptasczynski et al., 2022). One area in which uncertainty-based self-directed information sampling is likely to be of utility is meta-learning, wherein an artificial system must learn which weights to update based on an inferred context (and thus may avoid catastrophic forgetting when trained on multiple tasks). Several architectures implementing explicit metacognition or confidence have been proposed (Griffiths et al., 2019). For example, in rats trained to report confidence by placing a wager on difficult decisions, single neurons in the frontal cortex encode uncertainty information across multiple sensory modalities and predict confidence-scaled time investment (Lak et al., 2014; Masset et al., 2020). These results suggest a generalized representation of confidence as a "summary scalar" that could provide a robust uncertainty signal for subsequent decisions or learning processes and therefore constitute a precursor of metacognitive signals. The field is ripe for further exploration of whether such biological implementations could inform uncertainty predictions in ANNs (Griffiths et al., 2019; Gawlikowski et al., 2021).
Here, we have discussed one of many examples of how artificial system development may benefit from the study of metacognition and confidence in biological systems, and vice versa. Future work may also examine how cooperation between humans and artificial systems may be optimized through confidence-weighted communication, as it is between dyads or small groups of human decision-makers (Bahrami et al., 2010). Efforts to align vocabularies, literatures, and concepts across the fields of cognitive science and AI will assuredly benefit both fields.
Development of deep learning algorithms for high-throughput processing of complex behavior
Studying natural behaviors affords new understanding of how the brain controls movement (Krakauer et al., 2017) and processes sensory inputs (Hayhoe, 2018). But two major characteristics of natural behaviors challenge their use in neuroscience experiments: dynamic properties that are often difficult to quantify, and rich repertoires that require processing datasets far larger than is tractable with traditional manual or semimanual methods. Modern machine learning offers both unsupervised and supervised approaches to meet these challenges.
Unsupervised algorithms help researchers identify structure in high-dimensional behavioral data (Gilpin et al., 2020; McCullough and Goodhill, 2021). For example, researchers applied unsupervised dimensionality reduction, using linear projections or nonlinear embeddings (Tenenbaum et al., 2000; van der Maaten and Hinton, 2008; McInnes et al., 2018), to videos of freely moving worms (Stephens et al., 2008), fish (Mearns et al., 2020), flies (Berman et al., 2014), rodents (Stringer et al., 2019), and primates (Yan et al., 2020). Unsupervised clustering could then discover robust behavioral states in the resulting low-dimensional representations, and replacing clustering with hidden Markov models (HMMs) further captured sequences of behavioral states (Wiltschko et al., 2015).
ANN approaches that had proven successful in processing textual, auditory, and visual data, rather than as models in neuroscience, were recently harnessed to quantify complex behavior, in tandem with or replacing these machine learning methods. Unsupervised ANNs were used because they better capture diverse data distributions (Graving and Couzin, 2020): convolutional and variational autoencoders (Batty et al., 2019; Graving and Couzin, 2020; Luxem et al., 2022) performed dimensionality reduction before clustering or fitting HMMs, and generative adversarial networks (Goodfellow et al., 2014; Radford et al., 2015) improved interpolation between low-dimensional representations (Sainburg et al., 2018).
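To make this pipeline concrete, below is a minimal Python sketch of the unsupervised workflow: reduce stand-in pose features to a low-dimensional embedding, then fit an HMM over the embedding to recover discrete behavioral states and their transition structure. It assumes the scikit-learn and hmmlearn packages; the data and all parameter choices are illustrative, not any published model.

```python
import numpy as np
from sklearn.decomposition import PCA
from hmmlearn import hmm   # assumes the hmmlearn package is installed

rng = np.random.default_rng(3)
# Stand-in for high-dimensional pose features over time (e.g., keypoint
# coordinates from video), with two latent behaviors back to back.
frames = np.concatenate([rng.normal(0, 1, (500, 20)),
                         rng.normal(3, 1, (500, 20))])

# 1) Unsupervised dimensionality reduction of the behavior data.
embedding = PCA(n_components=3).fit_transform(frames)

# 2) An HMM over the embedding recovers discrete behavioral states and,
#    unlike plain clustering, their temporal transition structure.
model = hmm.GaussianHMM(n_components=2, covariance_type="diag",
                        n_iter=100, random_state=0).fit(embedding)
states = model.predict(embedding)
print(model.transmat_.round(2))   # state-transition probabilities
```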
Supervised ANNs are extremely useful in automating manual processing where humans can identify which data features to track and provide labeled training examples. Many striking examples come from pose estimation in movies of freely behaving animals. These algorithms learn to track human-defined anatomic features, such as joints and locations on the body of single or multiple animals, in videos captured from single (Mathis et al., 2018; Graving et al., 2019; Pereira et al., 2019) or multiple (Marshall et al., 2021) cameras. Mostly using deep convolutional ANNs, these supervised methods, like the unsupervised ones, extended and improved previous methods based on supervised classifier models (Dankert et al., 2009; Kabra et al., 2013; Machado et al., 2015).
Together, these ANN-based algorithms ushered in the field of computational neuroethology (Anderson and Perona, 2014; Datta et al., 2019; Pereira et al., 2020). However, while many species naturally vocalize, offering a rich window onto complex social interactions, fewer studies have developed audio analysis methods comparable with those created for video analysis (Sainburg and Gentner, 2021).
Audio analyses predominantly start by converting sound signals to spectrograms, two-dimensional representations in the time and frequency domains. This "image of sound," like visual data, was used to extract low-dimensional representations. For example, human-defined features, such as pitch, entropy, and amplitude, were continuously extracted from spectrograms to automate measuring the similarity between juvenile and tutor zebra finch songs (Tchernichovski et al., 2000). Unsupervised variational autoencoders were also used for continuous low-dimensional embedding of spectrograms (Goffinet et al., 2021). Still, rather than working on continuous signals, most machine learning tools for bioacoustics were developed for analyzing audio segments thought to represent basic vocal units, or syllables.
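As a concrete illustration of this first step, the sketch below computes a log-power spectrogram from a synthetic frequency sweep using SciPy; the signal and window parameters are arbitrary stand-ins for a real vocalization recording.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 44_100                      # sampling rate (Hz)
t = np.arange(0, 2.0, 1 / fs)
# Stand-in for a recorded vocalization: an upward frequency sweep.
audio = np.sin(2 * np.pi * (2000 + 1500 * t) * t)

# Convert the 1-D sound wave into a 2-D time-frequency "image of sound";
# downstream embedding, clustering, or CNNs then operate on Sxx.
f, times, Sxx = spectrogram(audio, fs=fs, nperseg=512, noverlap=384)
log_spec = 10 * np.log10(Sxx + 1e-12)   # log power, as typically visualized
print(log_spec.shape)                    # (frequency bins, time bins)
```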
Segmenting vocal communication allows creating models of syntax (Berwick et al., 2011; Jin and Kozhevnikov, 2011; Markowitz et al., 2013; Hedley, 2016) and motor learning (Sober and Brainard, 2009, 2012), and relating syllable acoustics and sequence to neural activity (Leonardo and Fee, 2005; Sober et al., 2008; Wohlgemuth et al., 2010). Researchers used unsupervised similarity metrics (Mets and Brainard, 2018), clustering (Daou et al., 2012; Burkett et al., 2015), embedding (Morfi et al., 2021; Sainburg et al., 2021), variational autoencoders (Kohlsdorf et al., 2020), and other generative deep networks (Pagliarini et al., 2021) to assist human identification of vocal units, visualize repertoire structures (Sainburg et al., 2020), and study their dynamics (Mets and Brainard, 2018; Kollmorgen et al., 2020). When human annotators created training sets of labeled audio segments, those segments were used to train supervised algorithms (Nicholson, 2016), such as support vector machines (Tachibana et al., 2014), template matching (Anderson et al., 1996), HMMs (Kogan and Margoliash, 1998), and k-nearest neighbors (Nicholson, 2016, 2021), allowing analyses of annotated syllables to scale up.
Still, these methods require the audio to be well segmented a priori; traditional segmentation techniques hence create a bottleneck, limiting the questions researchers can answer. Supervised deep ANNs offer solutions to this problem in rodents (Coffey et al., 2019) and songbirds (Koumura and Okanoya, 2016; Steinfath et al., 2021; Cohen et al., 2022). For example, TweetyNet (Cohen et al., 2022) is a supervised deep ANN that leverages temporal dependencies in vocalizations to achieve high-precision annotation across multiple species. TweetyNet offers a powerful tool to study the neuronal encoding of birdsong syntax (Cohen et al., 2020) and demonstrates how developments in modern machine learning open new frontiers in the study of natural behavior.
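The sketch below illustrates the general conv-recurrent design of such frame-wise annotators in PyTorch. It is inspired by, but is not, the published TweetyNet architecture; all layer sizes and class counts are illustrative.

```python
import torch
import torch.nn as nn

class FrameLabeler(nn.Module):
    """Toy conv-recurrent annotator in the spirit of TweetyNet
    (Cohen et al., 2022): convolutions extract local spectrogram
    features, a bidirectional LSTM propagates temporal context, and a
    linear readout labels every time bin (syllable class or silence).
    An illustrative sketch, not the published implementation."""
    def __init__(self, n_freq=128, n_classes=11, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),                 # pool frequency, keep time
        )
        self.rnn = nn.LSTM(8 * (n_freq // 2), hidden,
                           batch_first=True, bidirectional=True)
        self.readout = nn.Linear(2 * hidden, n_classes)

    def forward(self, spec):                      # (batch, 1, n_freq, n_time)
        h = self.conv(spec)                       # (batch, 8, n_freq//2, n_time)
        h = h.flatten(1, 2).transpose(1, 2)       # (batch, n_time, features)
        out, _ = self.rnn(h)
        return self.readout(out)                  # per-frame class logits

logits = FrameLabeler()(torch.randn(2, 1, 128, 500))   # shape (2, 500, 11)
```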
Finally, as different research laboratories developing and using ANNs to analyze behavior also develop their own data formats and algorithms, it is of the utmost importance for our community to develop and foster an ecosystem of interoperable methods to increase reproducibility and access.
In conclusion, we have briefly reviewed the state of the art of ANNs and how their development has been inspired by biological neural networks. Although ANNs are remarkably effective at solving specific tasks, they lack the ability of biological neural networks to generalize robustly across tasks (but see Reed et al., 2022). We suggest that future implementations of ANNs should incorporate some of the intricate multiscale organizing features of biological neural networks to generalize as well as they do and learn continually over a lifetime of experience.
Neuromodulatory systems endow biological neural networks with the ability to learn and adapt to constantly changing behavioral demands. Neuromodulators, such as dopamine, serotonin, noradrenaline, and acetylcholine, play crucial roles in modulating a repertoire of brain states spanning reward assessment, motivation, patience, arousal, and attention. The diverse phenomenology of neuromodulatory function is yet to be fully explored in ANNs, where its implementation has so far been mostly restricted to models of reinforcement learning (Shine et al., 2021; Mei et al., 2022).
Neurotransmitter and neuromodulator receptors are thought to modulate perceptual processes by activating receptor "hotspots" on the distal apical dendrites of neocortical layer 5 pyramidal cells (Takahashi et al., 2016). Recent implementations of ANNs have incorporated dendritic mechanisms to address the credit-assignment problem (Sacramento et al., 2018). Incorporating neurotransmitter receptor clusters on dendrites in ANNs could help unravel their role in gating perceptual processes, for example, NMDA receptors, which are distributed nonlinearly along the dendrites of most neuron types (Chavlis and Poirazi, 2021).
Biological neural networks promote renormalization and homeostasis of synaptic strength during different states of sleep, and facilitate learning and memory through replay. Replay enables the brain to consolidate memory and overcome forgetting of acquired knowledge, also referred to as “catastrophic forgetting” in machine learning. Implementing “sleep-like states” in deep neural networks could mimic biological replay mechanisms and prevent catastrophic forgetting (Roscow et al., 2021; Kudithipudi et al., 2022; Mei et al., 2022; Tsuda et al., 2022).
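A minimal PyTorch sketch of this idea follows: while training on a new task, each gradient step interleaves an example replayed from a stored buffer of the old task. The network, buffer contents, and replay fraction are illustrative; this is the generic replay recipe, not any of the cited implementations.

```python
import random
import torch
import torch.nn as nn

# Minimal sketch of replay against catastrophic forgetting: while
# training on task B, interleave samples stored from task A, loosely
# analogous to memory replay during sleep. Illustrative only.
net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

replay_buffer = [(torch.randn(10), torch.tensor(0)) for _ in range(200)]  # task A

def train_step(x_new, y_new, replay_fraction=0.5):
    xs, ys = [x_new], [y_new]
    if random.random() < replay_fraction:          # interleave a replayed memory
        x_old, y_old = random.choice(replay_buffer)
        xs.append(x_old); ys.append(y_old)
    loss = loss_fn(net(torch.stack(xs)), torch.stack(ys))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

for _ in range(100):                               # task B with replay
    train_step(torch.randn(10), torch.tensor(1))
```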
Recent findings demonstrate that metabolic state dynamically governs coding precision in biological neural networks. Under metabolic scarcity, the brain inactivates neural networks required for long-term memory to preserve energy (Padamsey et al., 2022). Biological neural networks regulate energy use through intrinsic mechanisms that limit energy consumption by reducing the impact of subthreshold variability on information coding. Biological neural networks therefore dynamically adapt their coding precision and energy expenditure in a context-dependent manner (Padamsey et al., 2022). Most ANN architectures are energetically expensive. How could the metabolic principles controlling coding precision inform implementations of energy-efficient ANNs?
Neuroscience is witnessing significant advances in our understanding of biological learning mechanisms that can continue to inform new avenues for ANNs. We suggest that the machine learning community could adopt these ideas and integrate them into standard ANN frameworks to build a solid foundation, and develop the next generation of ANNs informed by the multiscale organizing features of their biological analogs.
Footnotes
This work was supported by Brain and Behavior Research Foundation National Alliance for Research on Schizophrenia and Depression Young Investigator Award to V.B.-P.; Natural Sciences and Engineering Research Council Discovery Grants DGECR-2021-00293 and RGPIN-2021-03284 to V.B.-P.; Canadian Institute for Advanced Research Azrieli Global Scholar Fellowship to M.A.K.P.; Fonds de Recherche du Québec - Santé salary award 311492 to V.B.-P.; Marie Skłodowska-Curie Global Fellowship Agreement 842492 to S.R.; Newcastle University Academic Track Fellowship to S.R.; Fulbright Research Scholarship to S.R.; German Research Foundation OT562/1-1 and OT562/2-1 to T.O.; Marie Curie Individual Fellowship 844003 to G.W.L.; National Health and Medical Research Council Fellowship GNT1193857 to J.M.S.; Swartz Foundation to C.L.; National Institutes of Health Grants RF1DA055666 to T.A.E. and S10OD028632-01 to T.A.E. and C.L.; and Alfred P. Sloan Foundation Research Fellowship to T.A.E.
The authors declare no competing financial interests.
Correspondence should be addressed to Vincent Breton-Provencher at vincent.breton-provencher{at}cervo.ulaval.ca or Srikanth Ramaswamy at Srikanth.Ramaswamy{at}newcastle.ac.uk