Computational models in the age of large datasets

https://doi.org/10.1016/j.conb.2015.01.006

Highlights

  • Computational models will prove increasingly useful for understanding large datasets.

  • Substantial challenges exist for fitting detailed models to data.

  • Conceptual and phenomenological models are often more useful than detailed models.

Technological advances in experimental neuroscience are generating vast quantities of data, from the dynamics of single molecules to the structure and activity patterns of large networks of neurons. How do we make sense of these voluminous, complex, disparate and often incomplete data? How do we find general principles in the morass of detail? Computational models are invaluable and necessary in this task and yield insights that cannot otherwise be obtained. However, building and interpreting good computational models is a substantial challenge, especially so in the era of large datasets. Fitting detailed models to experimental data is difficult and often requires onerous assumptions, while more loosely constrained conceptual models that explore broad hypotheses and principles can yield more useful insights.

Introduction

By nature, experimental biologists collect and revere data, including the myriad details that characterize the particular system they are studying. At the same time, as the onslaught of data increases, it is clear that we need tools that allow us to crisply extract understanding from the data that we can now generate. How do we find the general principles hiding among the details, and how do we understand which details are critical features of a process, and which details can be approximated or ignored while still permitting insight into an important biological question? Intelligent model building coupled to disciplined data analyses will be required to progress from data collection to understanding.

Computational models differ in their objectives, limitations and requirements. Conceptual models examine the consequences of broad assumptions. These kinds of models are useful for conducting rigorous thought experiments: one might ask how noise impacts latency in a forced choice between multiple alternatives [1], or how network topology determines the fusion and rivalry of visual percepts [2]. While conceptual models must be constrained by data in the sense that they cannot violate known facts about the world, they do not strive to assimilate or reproduce detailed experimental measurements. Phenomenological data-driven models aim to capture details of empirically observed data in a parsimonious way. For example, reduced models of single neurons [3, 4] can often capture the behavior of neurons with simplified dynamics and few parameters. These kinds of models are useful for understanding ‘higher level’ functions of a neural system, be it a dendrite, a neuron or a neural circuit [5••], that, in the appropriate context, are independent of low-level details. Used carefully, they can tell us biologically relevant things about how nervous systems work without needing to constrain large numbers of parameters. Detailed data-driven or ‘realistic’ models attempt to assimilate as much experimental data as are available while accounting for detailed observations. Successful examples might include detailed structural models of ion channels that capture voltage-sensing and channel gating [6], or carefully parameterized models of biochemical signaling cascades underlying long-term potentiation [7]. With notable exceptions, models of this kind are often the least satisfying, as they can be most compromised by what has not been measured or characterized [8••].
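To make the phenomenological category concrete, the sketch below implements a leaky integrate-and-fire neuron, one of the simplest reduced single-neuron models of the kind referenced in [3, 4]. It is a generic illustration written for this text, not code from any of the cited studies, and all parameter values are arbitrary.

```python
import numpy as np

def simulate_lif(i_ext, dt=0.1, tau_m=10.0, v_rest=-65.0,
                 v_reset=-65.0, v_thresh=-50.0, r_m=10.0):
    """Leaky integrate-and-fire: tau_m * dV/dt = -(V - v_rest) + r_m * I."""
    v = v_rest
    trace, spike_times = [], []
    for step, i in enumerate(i_ext):
        v += ((v_rest - v) + r_m * i) * dt / tau_m   # forward-Euler update
        if v >= v_thresh:                            # threshold crossed: spike and reset
            spike_times.append(step * dt)
            v = v_reset
        trace.append(v)
    return np.array(trace), spike_times

# 200 ms of constant 2 nA drive (dt = 0.1 ms); units are nominal.
voltage, spikes = simulate_lif(np.full(2000, 2.0))
print(f"{len(spikes)} spikes in 200 ms")
```

A handful of interpretable parameters (membrane time constant, threshold, reset) reproduce the input-output behavior of interest without any of the biophysical machinery of a conductance-based model, which is precisely what makes such reduced models useful at the circuit level.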

How should we approach computational modeling in the era of ‘big data’? The non-linear and dynamic nature of biological systems is a key obstacle to building detailed models [8••, 9••], even when large amounts of data are available. For example, even well-characterized neural circuits such as the crustacean central pattern generators (CPGs), for which full connectivity diagrams exist, have not, to date, been successfully modeled at a level of detail that incorporates all of what is known about their synaptic physiology, intrinsic properties and circuit architecture [10]. As a consequence, there is still an important role for conceptual models that tell investigators what kinds of processes may underlie the data [11] or, more importantly, what potential mechanisms one should rule out [12, 13•].
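As a toy illustration of why detailed, highly nonlinear models resist fitting (the issue analyzed by Transtrum et al. in the reference list), the sketch below shows two quite different parameter sets of a sum-of-exponentials model producing nearly identical outputs over the measured range, so noisy data cannot pin down the individual parameters. The model and numbers are invented purely for illustration.

```python
import numpy as np

def two_exp(t, a1, k1, a2, k2):
    """Toy nonlinear model: y(t) = a1*exp(-k1*t) + a2*exp(-k2*t)."""
    return a1 * np.exp(-k1 * t) + a2 * np.exp(-k2 * t)

t = np.linspace(0.0, 5.0, 200)
y_a = two_exp(t, a1=1.0, k1=1.00, a2=1.0, k2=1.2)   # one parameter set
y_b = two_exp(t, a1=1.6, k1=1.05, a2=0.4, k2=1.4)   # a very different one

# The two predictions differ by well under 1% of the peak signal, so
# realistic measurement noise makes the parameters practically unidentifiable.
print(f"max |y_a - y_b| = {np.max(np.abs(y_a - y_b)):.3f}")
```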

Section snippets

Relating data to models

The Hodgkin-Huxley [14] model stands almost alone in its level of impact and in the way it achieved a more-or-less complete fit to the data. In hindsight, their success came from extraordinarily good biological intuition about how action potentials are generated and a clever choice of experimental preparation. Their model revealed fundamental principles of how a ubiquitous phenomenon — the spike, or action potential — resulted from a few processes, namely two voltage-dependent membrane currents
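For reference, the standard textbook form of the Hodgkin-Huxley membrane equation (reproduced here in its usual notation rather than from the article itself) couples the two voltage-dependent currents to a leak and an applied current:

```latex
C_m \frac{dV}{dt} = -\bar{g}_{\mathrm{Na}}\, m^3 h \,(V - E_{\mathrm{Na}})
                    - \bar{g}_{\mathrm{K}}\, n^4 \,(V - E_{\mathrm{K}})
                    - \bar{g}_L \,(V - E_L) + I_{\mathrm{ext}},
\qquad
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x, \quad x \in \{m, h, n\}.
```

Four differential equations and a small set of experimentally measured rate functions suffice to reproduce the action potential, which is why the model serves as the benchmark example of a near-complete fit between data and mechanism.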

Conceptual models as tools for explaining data and asking ‘what if?’

The mammalian prefrontal cortex (PFC) is one of the most complex and mysterious structures in neuroscience. Single-unit activity recorded from tens to hundreds of neurons reveals a diverse and puzzling array of activity profiles during behavioral tasks, with no obvious relation to external variables. Faced with a snapshot of data from a minuscule and only loosely identified population of neurons, a recent study was nonetheless successful in shedding light on how behavioral output can be represented in

Conflict of interest statement

Nothing declared.

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

This work was funded by NIH grant MH 46742 and the Charles A King Trust.

References (66)

  • M.K. Transtrum et al. Why are nonlinear fits to data so challenging? Phys Rev Lett (2010)

  • E. Marder et al. Understanding circuit dynamics using the stomatogastric nervous system of lobsters and crabs. Annu Rev Physiol (2007)

  • R. Chaudhuri et al. A diversity of localized timescales in network activity. eLife (2014)

  • A.L. Jacobs et al. Ruling out and ruling in neural codes. Proc Natl Acad Sci U S A (2009)

  • D.F. Goodman et al. Decoding neural responses to temporal cues for sound localization. eLife (2013)

  • A.L. Hodgkin et al. A quantitative description of membrane current and its application to conduction and excitation in nerve. J Physiol (1952)

  • J.A. Connor et al. Prediction of repetitive firing behaviour from voltage clamp data on an isolated neurone soma. J Physiol (1971)

  • M. Almog et al. A quantitative description of dendritic conductances and its application to dendritic excitation in layer 5 pyramidal neurons. J Neurosci (2014)

  • N.W. Gouwens et al. Signal propagation in Drosophila central neurons. J Neurosci (2009)

  • R.A. Mease et al. Emergence of adaptive computation by single neurons in the developing cortex. J Neurosci (2013)

  • A.A. Prinz et al. Similar network activity from disparate circuit parameters. Nat Neurosci (2004)

  • E. Marder et al. Multiple models to capture the variability in biological neurons and networks. Nat Neurosci (2011)

  • J.S. Caplan et al. Many parameter sets in a multicompartment model oscillator are robust to temperature perturbations. J Neurosci (2014)

  • A. Doloc-Mihu et al. Identifying crucial parameter correlations maintaining bursting activity. PLoS Comput Biol (2014)

  • A.L. Taylor et al. How multiple conductances determine electrophysiological properties in a multicompartment model. J Neurosci (2009)

  • F.A. Roemschied et al. Cell-intrinsic mechanisms of temperature compensation in a grasshopper sensory receptor neuron. eLife (2014)

  • G. Lillacci et al. Parameter estimation and model selection in computational biology. PLoS Comput Biol (2010)

  • J. Golowasch et al. Failure of averaging in the construction of a conductance-based neuron model. J Neurophysiol (2002)

  • G.J. Gutierrez et al. Multiple mechanisms switch an electrically coupled, synaptically inhibited neuron between competing rhythmic oscillators. Neuron (2013)

  • B. Marin et al. High prevalence of multistability of rest states and bursting in a database of a model neuron. PLoS Comput Biol (2013)

  • C.D. Meliza et al. Estimating parameters and predicting membrane voltages with conductance-based neuron models. Biol Cybern (2014)

  • G.J. Gutierrez et al. Rectifying electrical synapses can affect the influence of synaptic modulation on output pattern robustness. J Neurosci (2013)

  • Q.J. Huys et al. Efficient estimation of detailed single-neuron models. J Neurophysiol (2006)