Computational models in the age of large datasets
Introduction
By nature, experimental biologists collect and revere data, including the myriad details that characterize the particular system they are studying. At the same time, as the onslaught of data increases, it is clear that we need tools that allow us to crisply extract understanding from the data we can now generate. How do we find the general principles hiding among the details? How do we determine which details are critical features of a process, and which can be approximated or ignored while still permitting insight into an important biological question? Intelligent model building coupled with disciplined data analysis will be required to progress from data collection to understanding.
Computational models differ in their objectives, limitations and requirements. Conceptual models examine the consequences of broad assumptions. These models are useful for conducting rigorous thought experiments: one might ask how noise impacts latency in a forced choice between multiple alternatives [1], or how network topology determines the fusion and rivalry of visual percepts [2]. While conceptual models must be constrained by data in the sense that they cannot violate known facts about the world, they do not strive to assimilate or reproduce detailed experimental measurements.

Phenomenological data-driven models aim to capture details of empirically observed data in a parsimonious way. For example, reduced models of single neurons [3, 4] can often capture the behavior of real neurons with simplified dynamics and few parameters. These models are useful for understanding ‘higher level’ functions of a neural system (be it a dendrite, a neuron or a neural circuit [5••]) that, in the appropriate context, are independent of low-level details. Used carefully, they can tell us biologically relevant things about how nervous systems work without needing to constrain large numbers of parameters.

Detailed data-driven or ‘realistic’ models attempt to assimilate as much experimental data as are available and to account for detailed observations at the same time. Successful examples include detailed structural models of ion channels that capture voltage sensing and channel gating [6], and carefully parameterized models of the biochemical signaling cascades underlying long-term potentiation [7]. With notable exceptions, models of this kind are often the least satisfying, as they can be most compromised by what has not been measured or characterized [8••].
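As a concrete illustration of the phenomenological category, the leaky integrate-and-fire neuron compresses spike generation into a single linear membrane equation plus a threshold-and-reset rule. A minimal sketch follows; the parameter values are generic textbook defaults, not drawn from the references above:

```python
import numpy as np

def simulate_lif(i_ext, dt=0.1, tau=10.0, v_rest=-65.0,
                 v_thresh=-50.0, v_reset=-65.0, r_m=10.0):
    """Simulate a leaky integrate-and-fire neuron.

    i_ext : array of injected current (nA), one value per time step (ms = dt)
    Returns the voltage trace (mV) and the indices of spike times.
    """
    v = np.full(len(i_ext), v_rest)
    spikes = []
    for t in range(1, len(i_ext)):
        # Membrane equation: tau * dV/dt = -(V - V_rest) + R_m * I
        dv = (-(v[t - 1] - v_rest) + r_m * i_ext[t - 1]) * dt / tau
        v[t] = v[t - 1] + dv
        if v[t] >= v_thresh:      # threshold crossing: register a spike...
            spikes.append(t)
            v[t] = v_reset        # ...and reset, discarding spike shape entirely
    return v, spikes

# A constant 2 nA drive for 200 ms produces regular firing
v, spikes = simulate_lif(np.full(2000, 2.0))
```

The model has a handful of interpretable parameters and reproduces mean firing behavior, but by construction it says nothing about the biophysics of the spike itself; that trade-off is exactly what makes reduced models tractable to constrain.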
How should we approach computational modeling in the era of ‘big data’? The non-linear and dynamic nature of biological systems is a key obstacle to building detailed models [8••, 9••], even when large amounts of data are available. For example, even well-characterized neural circuits such as the crustacean central pattern generators (CPGs), for which full connectivity diagrams exist, have not, to date, been successfully modeled at a level of detail that incorporates all of what is known about their synaptic physiology, intrinsic properties and circuit architecture [10]. Consequently, there is still a large role for conceptual models that tell investigators what kinds of processes may underlie the data [11] or, more importantly, what potential mechanisms one should rule out [12, 13•].
Relating data to models
The Hodgkin-Huxley [14] model stands almost alone in its level of impact and in the way it achieved a more-or-less complete fit to the data. In hindsight, their success came from extraordinarily good biological intuition about how action potentials are generated and a clever choice of experimental preparation. Their model revealed fundamental principles of how a ubiquitous phenomenon — the spike, or action potential — resulted from a small number of processes: namely, two voltage-dependent membrane currents (a fast sodium current and a delayed-rectifier potassium current) acting alongside a passive leak conductance.
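The structure of the Hodgkin-Huxley model can be sketched directly: a current-balance equation with sodium, potassium and leak currents, and three gating variables with voltage-dependent rate constants. Below is a minimal forward-Euler implementation using the standard squid-axon parameters; the numerical scheme and step sizes are our illustrative choices, not details from the original paper:

```python
import numpy as np

# Classic Hodgkin-Huxley rate functions (squid axon; voltage in mV)
def alpha_n(v): return 0.01 * (v + 55) / (1 - np.exp(-(v + 55) / 10))
def beta_n(v):  return 0.125 * np.exp(-(v + 65) / 80)
def alpha_m(v): return 0.1 * (v + 40) / (1 - np.exp(-(v + 40) / 10))
def beta_m(v):  return 4.0 * np.exp(-(v + 65) / 18)
def alpha_h(v): return 0.07 * np.exp(-(v + 65) / 20)
def beta_h(v):  return 1.0 / (1 + np.exp(-(v + 35) / 10))

def simulate_hh(i_ext, dt=0.01):
    """Forward-Euler integration of the Hodgkin-Huxley equations.

    i_ext : array of injected current density (uA/cm^2), one per step of dt ms
    Returns the membrane voltage trace (mV).
    """
    # Maximal conductances (mS/cm^2) and reversal potentials (mV)
    g_na, g_k, g_l = 120.0, 36.0, 0.3
    e_na, e_k, e_l = 50.0, -77.0, -54.4
    c_m = 1.0                        # membrane capacitance (uF/cm^2)
    v = -65.0
    n, m, h = 0.317, 0.053, 0.596    # gating variables at their resting values
    trace = np.empty(len(i_ext))
    for t, i in enumerate(i_ext):
        # Two voltage-dependent currents (Na+, K+) plus a passive leak
        i_na = g_na * m**3 * h * (v - e_na)
        i_k = g_k * n**4 * (v - e_k)
        i_l = g_l * (v - e_l)
        v += dt * (i - i_na - i_k - i_l) / c_m
        # First-order kinetics for each gating variable
        n += dt * (alpha_n(v) * (1 - n) - beta_n(v) * n)
        m += dt * (alpha_m(v) * (1 - m) - beta_m(v) * m)
        h += dt * (alpha_h(v) * (1 - h) - beta_h(v) * h)
        trace[t] = v
    return trace

# A 10 uA/cm^2 step current for 50 ms elicits repetitive spiking
v = simulate_hh(np.full(5000, 10.0))
```

Four state variables and roughly a dozen parameters suffice to reproduce the spike, which is why this model remains the reference point for what a complete data-constrained fit can look like.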
Conceptual models as tools for explaining data and asking ‘what if?’
The mammalian prefrontal cortex (PFC) is one of the most complex and mysterious structures in neuroscience. Single-unit activity from tens to hundreds of neurons reveals a diverse and puzzling array of activity profiles during behavioral tasks, with no obvious relation to external variables. Faced with a snapshot of data from a minuscule and only loosely identified population of neurons, a recent study was nonetheless successful in shedding light on how behavioral output can be represented in the collective dynamics of the population as a whole.
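Analyses of this kind typically begin by looking for low-dimensional structure shared across heterogeneous single-unit responses. The sketch below is purely illustrative (it is not the cited study's actual pipeline): it projects simulated population firing rates onto their leading principal components to recover two hidden task signals mixed across many units:

```python
import numpy as np

def population_pca(rates, n_components=2):
    """Project population activity onto its top principal components.

    rates : array of shape (n_neurons, n_timepoints)
    Returns projections of shape (n_components, n_timepoints) and the
    fraction of total variance each component explains.
    """
    centered = rates - rates.mean(axis=1, keepdims=True)
    # SVD of the neuron-by-time matrix: columns of u are PCs in neuron space
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    var_explained = s**2 / np.sum(s**2)
    projection = u[:, :n_components].T @ centered
    return projection, var_explained[:n_components]

# 50 simulated "neurons" whose rates mix two latent task signals plus noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
latents = np.vstack([np.sin(2 * np.pi * t), t])   # two hidden signals
mixing = rng.normal(size=(50, 2))                 # heterogeneous tuning
rates = mixing @ latents + 0.05 * rng.normal(size=(50, 200))
proj, var = population_pca(rates)
```

In this toy case the first two components absorb nearly all the structured variance, even though no individual unit's response resembles either latent signal — the same logic by which population-level analyses extract order from diverse single-neuron profiles.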
Conflict of interest statement
Nothing declared.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
This work was funded by NIH grant MH 46742 and the Charles A King Trust.
References (66)
- et al. A modeling framework for deriving the structural and functional architecture of a short-term memory microcircuit. Neuron (2013)
- et al. Universally sloppy parameter sensitivities in systems biology models. PLoS Comput Biol (2007)
- et al. Extracting falsifiable predictions from sloppy models. Ann N Y Acad Sci (2007)
- et al. Sensory cortical population dynamics uniquely track behavior across learning and extinction. J Neurosci (2014)
- et al. Multialternative drift-diffusion model predicts the relationship between visual fixations and choice in value-based decisions. Proc Natl Acad Sci U S A (2011)
- et al. Network symmetry and binocular rivalry experiments. J Math Neurosci (2014)
- et al. Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. J Neurophysiol (2005)
- et al. A balance equation determines a switch in neuronal excitability. PLoS Comput Biol (2013)
- et al. Mechanism of voltage gating in potassium channels. Science (2012)
- Molecular computation in neurons: a modeling perspective. Curr Opin Neurobiol (2014)
- Why are nonlinear fits to data so challenging? Phys Rev Lett
- Understanding circuit dynamics using the stomatogastric nervous system of lobsters and crabs. Annu Rev Physiol
- A diversity of localized timescales in network activity. eLife
- Ruling out and ruling in neural codes. Proc Natl Acad Sci U S A
- Decoding neural responses to temporal cues for sound localization. eLife
- A quantitative description of membrane current and its application to conduction and excitation in nerve. J Physiol
- Prediction of repetitive firing behaviour from voltage clamp data on an isolated neurone soma. J Physiol
- A quantitative description of dendritic conductances and its application to dendritic excitation in layer 5 pyramidal neurons. J Neurosci
- Signal propagation in Drosophila central neurons. J Neurosci
- Emergence of adaptive computation by single neurons in the developing cortex. J Neurosci
- Similar network activity from disparate circuit parameters. Nat Neurosci
- Multiple models to capture the variability in biological neurons and networks. Nat Neurosci
- Many parameter sets in a multicompartment model oscillator are robust to temperature perturbations. J Neurosci
- Identifying crucial parameter correlations maintaining bursting activity. PLoS Comput Biol
- How multiple conductances determine electrophysiological properties in a multicompartment model. J Neurosci
- Cell-intrinsic mechanisms of temperature compensation in a grasshopper sensory receptor neuron. eLife
- Parameter estimation and model selection in computational biology. PLoS Comput Biol
- Failure of averaging in the construction of a conductance-based neuron model. J Neurophysiol
- Multiple mechanisms switch an electrically coupled, synaptically inhibited neuron between competing rhythmic oscillators. Neuron
- High prevalence of multistability of rest states and bursting in a database of a model neuron. PLoS Comput Biol
- Estimating parameters and predicting membrane voltages with conductance-based neuron models. Biol Cybern
- Rectifying electrical synapses can affect the influence of synaptic modulation on output pattern robustness. J Neurosci
- Efficient estimation of detailed single-neuron models. J Neurophysiol