Abstract
When exposed to rhythmic stimulation, the human brain displays rhythmic activity across sensory modalities and regions. Given the ubiquity of this phenomenon, how sensory rhythms are transformed into neural rhythms remains surprisingly inconclusive. An influential model posits that endogenous oscillations entrain to external rhythms, thereby encoding environmental dynamics and shaping perception. However, research on neural entrainment faces multiple challenges, from ambiguous definitions to methodological difficulties when endogenous oscillations need to be identified and disentangled from other stimulus-related mechanisms that can lead to similar phase-locked responses. Yet, recent years have seen novel approaches to overcome these challenges, including computational modeling, insights from dynamical systems theory, sophisticated stimulus designs, and study of neuropsychological impairments. This review outlines key challenges in neural entrainment research, delineates state-of-the-art approaches, and integrates findings from human and animal neurophysiology to provide a broad perspective on the usefulness, validity, and constraints of oscillatory models in brain–environment interaction.
Rhythms in Neural and Sensory Dynamics
Rhythmic activity is a prominent signature of intra- and extracranial brain recordings (Buzsáki et al., 2013; Jones, 2016). Such patterns, commonly referred to as neural oscillations, have been observed in various species including insects (Popov and Szyszka, 2020; Greenfield and Merker, 2023), rodents (Jacobs, 2014), humans, and nonhuman primates (Buzsáki and Vöröslakos, 2023). Neural oscillations are an endogenous property of neural circuits, that is, they can exist in the absence of any sensory stimulation (Fig. 1a) and even in vitro, when the tissue is isolated from dynamic input (Florez et al., 2015; De la Prida and Huberfeld, 2019). They are thought to reflect coordinated activity of neural ensembles within networks of brain areas, in support of various cognitive functions (Bressler and Kelso, 2001; Buzsáki and Draguhn, 2004; Uhlhaas et al., 2010). For instance, alpha oscillations (∼8–14 Hz) in sensory cortices are assumed to reflect “pulses” of inhibition that support attentional processes (Klimesch et al., 2007; Jensen and Mazaheri, 2010; Jensen, 2024), while beta oscillations (∼15–30 Hz) in motor-related areas have been linked to movement preparation (Barone and Rossiter, 2021) as well as sensory predictive processes (Arnal, 2012; Fujioka et al., 2012). Theta oscillations (∼4–7 Hz) are prevalent in a number of regions, dominating hippocampal activity to support spatial navigation and episodic memories (Herweg et al., 2020; Rudoler et al., 2023) while playing an important role for speech processing in the auditory cortex (Doelling et al., 2014; Zoefel and Kösem, 2024). Gamma oscillations have been ascribed a critical role in sensory processing (Gray, 1999; Buzsáki and Wang, 2012; Vinck et al., 2023), interareal communication (Fries, 2015), and memory (Howard et al., 2003; Griffiths and Jensen, 2023). 
The importance of these rhythms for healthy brain function is supported by a multitude of studies demonstrating disrupted dynamics in brain disorders associated with cognitive deficits, such as Alzheimer's, schizophrenia, autism, and dyslexia (Uhlhaas and Singer, 2006; Başar, 2013; Rojas and Wilson, 2014; Leicht et al., 2016; van Nifterick et al., 2023).
Figure 1. A neural oscillation is only one of a variety of related phenomena that contribute to phase-locked stimulus responses and that are difficult to dissociate experimentally. a, Endogenous neural oscillations are generated within local brain regions in the course of cognitive processes (generally thought to be higher-frequency oscillations, e.g., gamma) or between regions (lower-frequency oscillations). b, Presenting periodic sensory input, such as music and language, or stimulating the brain directly via electrical or magnetic techniques such as transcranial magnetic stimulation can generate rhythmic brain responses; however, the existence of rhythmic neural activity need not imply involvement of a neural oscillator if activity can be accounted for by a series of evoked responses. c, Sensory or other stimulation might be used to induce neural oscillations (as when a single magnetic pulse causes a circuit to reverberate at its preferred frequency) without implying entrainment. d, A series of periodic inputs may gradually synchronize and entrain a neural circuit that has resonant properties. In practice, it can be difficult to distinguish between b and d in neural data. e, Factors that might influence the susceptibility of brain regions and circuits to neural entrainment by external stimuli.
A key scenario in which rhythmic neural activity is observed is during periodic sensory stimulation, in a frequency range that matches the stimulation frequency (and its integer multiples, called harmonics), which is well documented in invasive and noninvasive electrophysiological recordings (Adrian and Matthews, 1934; Walter and Walter, 1949; Kimura, 1980). This rhythmic activity can manifest as periodic responses, increased spectral amplitude, or a consistent relationship between neural and stimulation phase. A prominent hypothesis is that these rhythmic brain responses reflect coupling of endogenous neural oscillations to the rhythmic external drive (Lakatos et al., 2008). This notion relies on the assumption that spontaneous neuronal oscillations are self-sustained oscillators as defined in dynamical systems theory, with corresponding properties such as entrainment and resonance (Pikovsky et al., 2001; Helfrich et al., 2019; Strogatz, 2019; van Bree et al., 2022). Mechanistically, it has been proposed that the high-excitability phase of neural oscillations becomes aligned to relevant events in a rhythmic sequence, thereby boosting their processing (Lakatos et al., 2008; Schroeder and Lakatos, 2009). This theory has become highly influential and has been applied to many scientific questions that entail dynamic interaction with the environment. For example, in current models of speech perception (Giraud and Poeppel, 2012; Meyer et al., 2020; Poeppel and Assaneo, 2020; Kazanina and Tavano, 2023; Zoefel and Kösem, 2024), the similarity between frequencies of neural oscillations and the various “building blocks” in human speech (e.g., prosody, words, syllables, and phonemes) has inspired the idea that oscillations parse sensory information or shape its representation by synchronizing to external rhythms (Henry and Obleser, 2012; Gnanateja et al., 2022).
In general, the oscillatory entrainment hypothesis provides an elegant and efficient explanation of how brain dynamics can interact with those in the environment.
The term “entrainment” originates from dynamical systems theory and assumes the involvement of an oscillator coupled to a periodic input either uni- or bidirectionally, a definition we adopt here unless stated otherwise (for more details, see Definitions of terms are variable). However, this term has been expanded and is now used liberally in cognitive neuroscience to describe any case of phase-locked neural responses (Lakatos et al., 2019; Obleser and Kayser, 2019). Although more neutral terms like “tracking” have been suggested (cf. Banki et al., 2022), “entrainment” continues to be a popular label for stimulus-aligned brain responses without strong evidence for an oscillatory origin. For some scientific applications of rhythmic sensory stimulation, knowledge of this origin may indeed not be required. One example is the concept of “frequency tagging,” where the rhythmic brain response is used as a readout of the participant's attentional state. In this context, the term “entrainment” is less frequently used (Regan, 1966; Müller et al., 1998; Zhigalov et al., 2019; Drijvers et al., 2021). Applications for rhythmic brain responses can also be found in clinical research, e.g., in the identification of biomarkers (Sivarao, 2015; Javitt et al., 2020) and as a therapeutic tool in the treatment of Alzheimer's disease (Iaccarino et al., 2016). Yet in many cases, it is essential to determine the involvement of entrained oscillations, as the computational principles that feature in models, such as phase-based encoding or prediction, rely on properties of oscillators such as period correction and resonance.
The mechanistic origins of neural entrainment, and in particular the role of neural oscillations, have proven to be controversial, with diverging results and positions in the literature (Keitel et al., 2014, 2022; Haegens and Zion Golumbic, 2018; Zoefel et al., 2018; Meyer et al., 2020; Doelling and Assaneo, 2021; van Bree et al., 2022). For example, identifying endogenous neural oscillations during rhythmic stimulation, i.e., establishing the involvement of a neural circuit that can produce rhythmic activity on its own (Fig. 1a), is not always straightforward. Thus, the model of neural oscillations entrained to sensory rhythms faces multiple challenges that can appear to call its usefulness into question. Considering the prevalence of rhythmic stimulation in cognitive neuroscience and the diverging evidence on its effects on neural mechanisms, this review aims to outline significant challenges in research on neural entrainment in sensory systems, as well as novel approaches to overcome methodological concerns and miscommunication. We highlight studies that have employed these approaches to provide evidence for an involvement of endogenous oscillations in the processing of rhythmic input, as well as for the constraints on such mechanisms, and we critically discuss the usefulness of an “oscillator” for studying neural dynamics.
Challenges in the Study of Neural Entrainment
Definitions of terms are variable
A key issue in the field of neural entrainment is variability in its definitions. While the terms “synchronization,” “oscillation,” and “entrainment” have mathematical definitions in dynamical systems theory (Pikovsky et al., 2001), they are more ambiguous in cognitive neuroscience. According to dynamical systems theory, “entrainment” involves an active (uni- or bidirectional) influence between oscillators, whereas “synchronization” implies a zero-lag phase relationship between two processes but does not necessarily involve oscillators (Bittman, 2021). The definition of these terms is often altered in cognitive neuroscience so that “entrainment” simply refers to neural responses phase-locked to sensory events (cf. the “broad sense” in Obleser and Kayser, 2019). Likewise, “synchronization” includes other phase lags besides zero to consider potential delays between the occurrence of an event and its neural processing. Finally, rhythmic signals are often termed “oscillations” regardless of the source of rhythmicity, a challenge we describe in “Both endogenous rhythms and other stimulus-related signals may contribute to stimulus-locked brain activity” and illustrate in Figure 1. The importance of embracing the mathematical definition of these terms in cognitive neuroscience has been discussed previously (Helfrich et al., 2019; Lakatos et al., 2019; Obleser and Kayser, 2019). Rather than reiterating the relevance of these “true” definitions, we here propose that authors’ hypotheses on the underlying mechanisms should be detailed as precisely as possible when studying phase-locked responses to time-varying signals. Outlining these assumptions could be guided by questions like: Does the study assume that endogenous oscillatory dynamics underlie the observed response to stimulation? Can this oscillatory rhythm be observed in the absence of the stimulus, i.e., does it emerge from a neural circuit that produces spontaneous oscillations within the investigated frequency band? 
What are the neural mechanisms through which the external stimulus may or may not be able to modulate activity in the circuit? As we will argue in this review, providing answers to these questions will make future studies more consistent, replicable, and easier to interpret.
Properties of neural generators are rarely considered
A large portion of the literature on neural entrainment is based on electroencephalography (EEG) and magnetoencephalography (MEG) data from human participants or on extracellular fields measured in rodents and primates. These methods do not provide access to the low-level circuitry underlying the recorded rhythms, leading to ambiguity in how experimenters define entrainment at the neuronal level. Does rhythmic stimulation need to modulate the postsynaptic potential or spiking of individual neurons in the circuit, or should it influence the complex dynamical processes generating the oscillation? One example is the 40 Hz gamma rhythm, which has become a popular target for entrainment by visual stimulation (Iaccarino et al., 2016). This work is opposed by studies arguing that spontaneously generated, native gamma oscillations do not entrain to a visual flicker (Duecker et al., 2021; Schneider et al., 2023; Soula et al., 2023). To solve the divergence in the literature, should we consider the endogenous oscillator to be entrained only if the stimulation affects the circuit of pyramidal neurons and interneurons responsible for generating gamma rhythms spontaneously? Clearly stating the criteria for evidence of entrainment in each study may help reconcile the conflicting findings in the literature.
Furthermore, entrainment implies that properties of the neural circuit govern which stimulus features and rates drive its activity, amplifying only certain stimuli for downstream processing. Neurons and their ensembles are selective to features, complexity, frequency, and timing of sensory input and therefore do not respond equally to each type of incoming information (Fig. 1e). For a neuronal circuit to be entrained by rhythmic sensory stimulation, the synaptic input to the circuit should be sensitive to the stimulus and faithfully maintain its timing. However, nonlinear transformations of the sensory input during low-level processing modify the signal as it propagates to higher-order cortical regions (Schneider et al., 2023; Gautam et al., 2024), potentially preventing or altering entrainment in areas beyond the sensory systems.
In addition, some oscillations may have functional properties to which an adaptation to sensory input would be counterproductive, such as when they provide a temporal structure to neural processing (Lisman and Jensen, 2013; ten Oever et al., 2024). In this case, the oscillation might, at the risk of losing internal stability, not adapt immediately to the timing of sensory input. In practice, however, such an “unentrainability” is difficult to demonstrate, as it might simply result from the mismatch between entraining stimulus and the “preference” of the oscillator, described in the preceding paragraph. Nevertheless, the presence of oscillatory dynamics in a cortical region at rest does not allow any inference about whether the corresponding neural generator can be entrained by a stimulus. At the same time, a lack of evidence for entrainment does not suffice to refute the presence of a neural oscillator which may oscillate but not entrain. Both the presence and entrainability of an oscillator must be established independently.
Both endogenous rhythms and other stimulus-related signals may contribute to stimulus-locked brain activity
It is a recurring question whether rhythmic brain activity during rhythmic stimulation reflects an entrainment of neural oscillations to the stimulus (Fig. 1d) over and above the responses that are inherently evoked by each individual event in the rhythmic sequence and together form a regular pattern that reflects the rhythmicity of the stimulus (Fig. 1b; Walter and Walter, 1949; Keitel et al., 2014; Zoefel et al., 2018). This question is not merely one of semantics, as the underlying mechanisms will have distinct effects on downstream processing and cognitive functions (Breska and Deouell, 2017a; Doelling and Assaneo, 2021). Considering that sensory systems respond to a wide range of stimulation frequencies (Herrmann, 2001; Brugge et al., 2009; Duecker et al., 2021) but also selectively amplify sensory rhythms at certain frequencies (Picton et al., 1987; Herrmann, 2001), it is likely that both endogenous oscillatory dynamics and stimulus-evoked responses contribute to brain responses during rhythmic stimulation, and it is therefore challenging to separate them in a given recording. Notably, this issue is not restricted to the interpretation of neurophysiological data. If rhythmic behavioral responses (e.g., detection of a target) are measured during a rhythmic stimulus, it cannot be ruled out that these are due to masking effects at regular moments in time (e.g., a target might be easier to detect during gaps in a rhythmic sequence). Furthermore, if rhythmic behavioral responses are observed during transcranial alternating current stimulation (tACS), often assumed to entrain neural oscillations (Herrmann et al., 2013), then the behavioral rhythm might simply reflect the alternating current applied (most current injected at the peak and trough of the tACS signal) rather than an endogenous neural oscillation (Zoefel, 2018; van Bree et al., 2021).
Periodicities in neural activity and behavior can also be explained by processes linked to temporal anticipation of the periodic occurrence of events, such as interval-based prediction. The brain is capable of learning an association between a cue and a specific interval initiated by it, as is classically demonstrated in eyeblink conditioning (Christian and Thompson, 2003). It was also shown that this ability goes beyond motor timing, as predictions from cue–interval associations can also proactively guide attentional preparation (Coull and Nobre, 1998). This effect is accompanied by adjustment of ramping activity and anticipatory modulations of band-limited activity, e.g., in alpha and beta bands (Miniussi et al., 1999; Rohenkohl and Nobre, 2011). Critically, recent work that directly compared the behavioral and neural expressions of aperiodic interval-based prediction with those of rhythmic streams found that they were overlapping during stream presentation. This included both phase alignment of low-frequency oscillations that match the rhythm/interval frequency and amplitude modulations at various frequencies (Breska and Deouell, 2017b). Therefore, as a periodic stream is inherently composed of a set of concatenated intervals, it is difficult to rule out that periodic alignment and fluctuations reflect the repeated operation of such nonoscillatory, discrete interval prediction mechanisms.
Approaches to Identify Neural Entrainment
Addressing these challenges, we highlight here recent attempts to identify neural mechanisms underlying phase-locked neural responses to rhythmic stimulation. Given their prominence in the field, we focus on experimental settings that are designed to identify endogenous oscillations. Our aim is to emphasize that the phenomenon of neural phase-locking to a rhythmic stimulus should be studied and considered on a case-by-case basis to identify the underlying principles under which the neural response to rhythmic stimulation can be categorized as neural entrainment.
We organize this section into approaches attacking distinct components of oscillatory behavior as defined by dynamical systems theory. These approaches have often led to competing answers even within the same domain, in which case we discuss possible explanations. Finally, we conclude by synthesizing these findings across domains to characterize the diversity of underlying mechanisms within the capabilities of neural function and how these transformations help to construct the neurocognitive experience.
Changes in preexisting oscillatory dynamics by an external drive
One approach to identify endogenous oscillations in entrained brain responses in EEG and MEG recordings is to focus on changes in oscillatory dynamics that were present prior to the rhythmic external drive. Notbohm et al. (2016) investigated the effect of visual flicker on oscillations in the alpha band. Prior to the flicker, the authors identified the individual alpha frequency for each participant based on the resting-state EEG. They then tested how this frequency relates to the flicker frequencies and intensities that led to the strongest phase-locking between the EEG signal and the stimulus. Indeed, the authors demonstrated strongest phase-locking if the visual flicker was centered at the individual alpha frequency and furthermore that higher amplitudes of flicker were required to synchronize alpha at wider ranges of frequencies. This represents a known property of physical oscillators (Pikovsky et al., 2001) commonly referred to as Arnold tongues, which clearly demonstrate the eigenfrequency of the oscillator and how its internal dynamics govern its response to stimulus rates at or near this “preferred” frequency (Fig. 2a).
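The Arnold tongue logic described above can be illustrated with a deliberately simple forced phase oscillator. The sketch below is not the analysis of Notbohm et al. (2016); it assumes the textbook model dθ/dt = 2π·f_natural + K·sin(drive phase − θ), in which the coupling K (in rad/s) stands in for stimulus amplitude. All function names, the 10 Hz natural frequency, and the coupling values are hypothetical choices made for illustration only.

```python
import math

def observed_frequency(f_natural, f_drive, coupling, duration=20.0, dt=0.001):
    """Mean frequency (Hz) of the forced oscillator over the second half of the run.

    Assumed model: d(theta)/dt = 2*pi*f_natural + coupling * sin(drive_phase - theta),
    integrated with a plain Euler scheme. `coupling` is in rad/s.
    """
    n_steps = int(round(duration / dt))
    half = n_steps // 2
    theta = 0.0
    theta_at_half = 0.0
    for step in range(n_steps):
        if step == half:
            theta_at_half = theta  # phase at the start of the measurement window
        drive_phase = 2.0 * math.pi * f_drive * step * dt
        theta += (2.0 * math.pi * f_natural
                  + coupling * math.sin(drive_phase - theta)) * dt
    return (theta - theta_at_half) / (2.0 * math.pi * (n_steps - half) * dt)

def is_locked(f_natural, f_drive, coupling, tol=0.02):
    """True if the oscillator has adopted the drive frequency (within tol Hz)."""
    return abs(observed_frequency(f_natural, f_drive, coupling) - f_drive) < tol

def locking_range(f_natural, coupling, detunings):
    """Detunings (Hz) at which the oscillator locks to the drive."""
    return [d for d in detunings if is_locked(f_natural, f_natural + d, coupling)]

if __name__ == "__main__":
    detunings = [-1.0, -0.5, -0.2, 0.0, 0.2, 0.5, 1.0]
    for coupling in (2.0, 6.0):  # weak vs. strong drive (illustrative values)
        print(coupling, locking_range(10.0, coupling, detunings))
```

In this toy model the oscillator locks only when the detuning is small relative to the coupling, so the set of locking frequencies widens as the drive gets stronger. That widening is the triangular "tongue" shape; real neural data would additionally require estimating phase-locking from noisy recordings.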
Figure 2. a, Resonance zones known as Arnold tongues govern oscillatory dynamics. Oscillators have preferred frequencies (f-endog), wherein the frequency range over which oscillations may be induced is a function of the amplitude of the stimulus. Oscillators are more difficult to entrain at nonpreferred frequencies. b, Investigating changes in oscillatory dynamics by sensory stimulation (left, surrogate time series; right, spectrogram). Top panel, Properties of oscillatory dynamics (e.g., peak frequency) are identified in the interval before the rhythmic stimulus. Middle panel, If an endogenous oscillator is successfully entrained by a stimulus, it changes its frequency accordingly (within its preferred range). Bottom panel, A schematic illustration of results from Duecker et al. (2021). Endogenous gamma oscillations were not entrained and coexisted with flicker responses in the MEG signal.
Many studies have demonstrated preferred stimulus rates. The auditory system, for example, seems to have several preferred stimulus rates: one closer to the typical syllable rate of speech (∼4–8 Hz; Poeppel and Assaneo, 2020; L’Hermite and Zoefel, 2023; Zoefel and Kösem, 2024) and the other ∼40 Hz, at which frequency a response known as the auditory steady state response (ASSR) can be robustly driven (Galambos et al., 1981; Picton et al., 1987; Ross et al., 2003). In contrast to the visual domain, alpha and beta bands do not phase-lock well to auditory stimuli at matching rates (Teng et al., 2017; Weisz and Lithari, 2017; Teng and Poeppel, 2020). Additional preferred frequencies appear to exist in higher-frequency bands centered on 80 and 200 Hz, which may relate to processing vocal pitch (Tichko and Skoe, 2017; Coffey et al., 2021). Although the presence of such preferred rates speaks for an involvement of a neural oscillator, there are scenarios in which evoked responses can have similar preferences. For instance, if responses evoked by individual events in a regular stimulus overlap, their summed response will be higher than if they do not overlap, leading to a preferred rate even without underlying oscillatory mechanism (Edwards and Chang, 2013). A demonstration of Arnold tongues, i.e., a restricted range of phase-locking which widens with increased stimulus amplitude (Fig. 2a), is therefore crucial for the identification of neural entrainment.
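The point that overlapping evoked responses can mimic a preferred rate can be made concrete with a purely linear toy model. The sketch below sums an identical biphasic response to every event in an isochronous train and measures the steady-state amplitude at the stimulation rate. The kernel shape (a Gaussian derivative), its width and latency, and the tested rates are assumptions chosen for illustration; they are not taken from Edwards and Chang (2013) or fitted to any data.

```python
import cmath
import math

SIGMA = 0.005    # kernel width in seconds (illustrative assumption)
LATENCY = 0.020  # kernel centre in seconds (illustrative assumption)
KERNEL_LEN = 0.060

def evoked_kernel(lag):
    """Biphasic response to a single event; zero outside 0..60 ms after the event."""
    if lag < 0.0 or lag > KERNEL_LEN:
        return 0.0
    x = lag - LATENCY
    return -x / SIGMA ** 2 * math.exp(-x * x / (2.0 * SIGMA ** 2))

def summed_response(rate_hz, t):
    """Linear sum of the responses to all events that can still influence time t."""
    first = max(0, math.ceil((t - KERNEL_LEN) * rate_hz) - 1)
    last = math.floor(t * rate_hz) + 1
    return sum(evoked_kernel(t - k / rate_hz) for k in range(first, last + 1))

def amplitude_at_rate(rate_hz, fs=1000.0, start=0.5, stop=1.0):
    """Steady-state amplitude at the stimulation rate (Fourier projection)."""
    n = int(round((stop - start) * fs))
    acc = 0j
    for i in range(n):
        t = start + i / fs
        acc += summed_response(rate_hz, t) * cmath.exp(-2j * math.pi * rate_hz * t)
    return 2.0 * abs(acc) / n

if __name__ == "__main__":
    for rate in (10.0, 20.0, 40.0, 80.0):
        print(rate, round(amplitude_at_rate(rate), 2))
```

With these settings the response at the stimulation rate is largest at an intermediate rate, although the model contains no oscillator: the "preference" follows from how the fixed kernel overlaps with itself. The preference also does not depend on stimulus amplitude in this linear model, which is one reason the amplitude-dependent widening of the locking range is a more diagnostic signature.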
Following a similar logic, Duecker et al. (2021) combined MEG with a rapid visual flicker (>50 Hz) to investigate entrainment and resonance in the gamma band (Fig. 2b). This stimulation technique has gained increasing popularity as rapid invisible frequency tagging to probe cortical excitability with high temporal resolution while reducing the visibility of the flicker (Zhigalov et al., 2019; Pan et al., 2021; Minarik et al., 2023). When applied to an invisible patch in the background color, i.e., in the absence of identifiable gamma oscillations prior to the stimulation, the rapid flicker evoked identifiable responses to stimulation frequencies of up to 80 Hz. However, the visual cortex did not seem to selectively amplify frequencies in the stimulation range (Duecker et al., 2021). To investigate whether the visual flicker can entrain endogenous oscillations, Duecker and colleagues first induced gamma oscillations using a moving grating stimulus (Hoogenboom et al., 2006, also van Pelt et al., 2012). The rapid flicker was then imposed on the grating, which allowed the authors to investigate changes in ongoing oscillations. Importantly, there was no evidence that the phase or frequency of the grating-induced gamma rhythm synchronized to the flicker, but instead, the two activities seemed to coexist in the early visual cortex (Fig. 2b, bottom). Indeed, MEG beamforming localized the flicker response to the primary visual cortex, while the gamma oscillations were strongest in the secondary visual cortex. Moreover, the frequency inducing the strongest flicker response was robustly lower than the peak frequency of the gamma oscillations. These findings have been supported by a study investigating temporal response functions in the gamma band in response to a broadband flicker applied to moving gratings (Zhigalov et al., 2021). 
The authors found that the peak frequency of the perceptual “gamma echo” was significantly lower than the frequency of the endogenous gamma oscillations.
Duecker et al. (2021) hypothesized that entrainment may have been prevented by the low-pass filter properties of the visual system (Hawken et al., 1996; Connelly et al., 2016). Moreover, the flicker may have been unable to modulate the activity of the inhibitory interneurons that are known to be critically involved in generating gamma oscillations (Traub et al., 1996). Indeed, these hypotheses have since been supported by intracranial recordings in mice, showing that a visual flicker at 40 Hz is attenuated across the ventral stream (Schneider et al., 2023). This finding was further attributed to the low-pass filter properties of cortical pyramidal neurons (Schneider et al., 2023; Soula et al., 2023). Moreover, while gamma oscillations have long been hypothesized to coordinate communication in the visual system (Fries, 2015), recent work suggests that, instead, gamma oscillations carry information about the predictability of the visual input (Peter et al., 2019; Vinck et al., 2023). As such, the mechanisms underlying gamma oscillations in the visual cortex may have prevented entrainment by external stimuli.
Entrainment echoes after the stimulus offset
Another approach to leverage the oscillator's predicted temporal dynamics is to focus on “entrainment echoes”—rhythmic brain responses that are produced by a rhythmic stimulus but outlast it briefly, reflecting reverberation of the oscillating circuit (Hanslmayr et al., 2014; van Bree et al., 2021). As these echoes are measured after (rather than during) a rhythmic stimulus, alternative explanations (such as regular evoked responses; see Both endogenous rhythms and other stimulus-related signals may contribute to stimulus-locked brain activity; Fig. 1b) are more straightforward to rule out. A methodological caveat for the identification of entrainment echoes is the “temporal smearing” induced by spectral analysis methods (de Cheveigné and Nelken, 2019) required to estimate the frequency, amplitude, or phase of the signal (e.g., filtering, wavelet analysis, FFT) and which can produce spurious entrainment echoes by artificially “prolonging” stimulus-related activity beyond the stimulus offset. Nevertheless, multiple studies have demonstrated such echoes even when this issue was controlled for (Hanslmayr et al., 2014; Spaak et al., 2014; Hickok et al., 2015; Lerousseau et al., 2021).
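The temporal smearing caveat can be demonstrated with a signal that contains no echo by construction. In the sketch below, a 4 Hz response stops exactly at stimulus offset, yet a windowed spectral estimate centred after the offset still reports energy at 4 Hz because the analysis window reaches back into the stimulation period. The sampling rate, window length, Hann taper, and function names are assumptions made for this illustration and do not correspond to the pipeline of any cited study.

```python
import cmath
import math

FS = 200.0     # sampling rate in Hz (illustrative)
STIM_HZ = 4.0  # stimulation rate in Hz (illustrative)
OFFSET = 2.0   # stimulus and response both end at 2 s

def response(t):
    """A response that stops dead at stimulus offset: no echo by construction."""
    return math.sin(2.0 * math.pi * STIM_HZ * t) if 0.0 <= t < OFFSET else 0.0

def windowed_amplitude(centre, width=1.0):
    """Amplitude at STIM_HZ estimated from a Hann-tapered window centred on `centre`."""
    n = int(round(width * FS))
    acc = 0j
    taper_sum = 0.0
    for i in range(n):
        t = centre - width / 2.0 + i / FS
        taper = 0.5 - 0.5 * math.cos(2.0 * math.pi * i / n)
        acc += taper * response(t) * cmath.exp(-2j * math.pi * STIM_HZ * t)
        taper_sum += taper
    return 2.0 * abs(acc) / taper_sum

if __name__ == "__main__":
    for centre in (1.0, 2.25, 2.6):
        print(centre, round(windowed_amplitude(centre), 3))
```

The estimate centred 250 ms after offset is clearly nonzero even though the underlying signal is silent, whereas a window that starts after the offset returns zero. One conservative control that follows from this is to evaluate echoes only in windows (or filter supports) that do not overlap the stimulation period.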
Van Bree et al. (2021) tested for entrainment echoes in MEG after intelligible or unintelligible noise-vocoded speech, presented rhythmically at 2 or 3 Hz. They found that rhythmic MEG responses, specific to the stimulation rate, outlasted the rhythmic speech, but only when it was intelligible. These echoes seemed to originate from the cerebellum and trigger connectivity with left inferior frontal regions (Zoefel et al., 2024). As described in this review, speech is not the only acoustic stimulus that can produce evidence for entrainment. The speech specificity of the effect should therefore not be interpreted as a demonstration that only speech can entrain oscillations in audition. However, it implies that neurons involved in oscillatory circuits may be more or less sensitive to specific stimulus features and illustrates that not all stimuli produce the same aftereffects (see Properties of neural generators are rarely considered). Another line of studies used rhythmic tone or noise sequences to test for entrainment echoes in auditory perception. Rhythmic changes in the detection of a short pure tone were only observed after the offset of rhythmic sequences when they were presented between 2 and 8 Hz (Farahbod et al., 2020; L’Hermite and Zoefel, 2023), a frequency range critical for human communication and music perception. This finding demonstrates, in addition to a presence of entrainment echoes, preferred rates for auditory perception (see Changes in preexisting oscillatory dynamics by an external drive). Finally, tACS at 3 Hz leads to rhythmic changes in the accuracy of word report at the corresponding rate that outlasts the electric stimulation (van Bree et al., 2021). This finding represents evidence that tACS can also entrain endogenous oscillations as often assumed (Herrmann et al., 2013).
It is important to note that not all studies have identified signatures of entrainment echoes in the low-frequency range. Oscillatory reverberation should facilitate detection of near-threshold targets appearing at on-beat times when compared with detection of off-beat targets or those presented after aperiodic stimulation. However, no such patterns were found in some cases (Lin et al., 2022; Sun et al., 2022; cf. Keitel et al., 2022). L’Hermite and Zoefel (2023) showed that entrainment echoes in audition might be organized tonotopically. Accordingly, moments of most accurate target detection after a rhythmic (∼6–8 Hz) tone stimulus depended on the difference in sound frequency between entrainer and target. Moreover, for identical sound frequencies, target detection was most accurate at off-beat times when the stimulation rate was constant across trials but on-beat when the rate was variable, an effect interpreted as repetition-related habituation only during constant stimulation. Thus, how entrainment echoes manifest in perception might be more complex than thought and depends on stimulus properties as well as their larger context. This conclusion might explain some of the null effects described.
Another recent study combined this approach with computational modeling and MEG, to study the role of oscillatory mechanisms in speech perception (Oganian et al., 2023). The focus was on perception of naturalistic speech, which has typical nonisochronous temporal dynamics, rather than rhythmized speech as in, e.g., poetry. The authors asked whether the well known observation of phase alignment between human scalp recording and the speech envelope (Luo and Poeppel, 2007) reflects oscillatory entrainment, or alternatively, encoding by evoked responses. Generative computational models of these two mechanisms provided time-resolved predictions of the spectral and temporal dynamics of phase alignment during stimulus-free pauses in the speech signal. Critically, only the oscillator model predicted stable phase alignment during pauses, resulting from oscillatory reverberation. These predictions were then compared with MEG data recorded while participants were listening to the same speech stimulus as used in model simulations. The results indicated that both the spectral distribution and the temporal dynamics of phase alignment were in line with the evoked response model, with no sustained phase alignment due to oscillatory reverberation. These findings stand against the idea that oscillatory entrainment is involved in speech perception in natural settings, where speech mostly deviates from isochrony. On the other hand, some of the above-described entrainment echoes were specific to intelligible speech (van Bree et al., 2021; Zoefel et al., 2024), which instead would suggest that an entrainment mechanism tailored to process human speech does exist. Although other reasons for this discrepancy still need to be identified, it again highlights the strong context dependence of oscillatory mechanisms (Fig. 1e) and their interaction with both stimulus properties and other nonoscillatory processes.
Phase-locked responses to higher stimulus rates (∼80–500 Hz) observed during stimulus presentation are often termed the frequency following response (FFR; Fig. 3). FFRs are closely tied to periodicity present in the environment, which is critical for perceiving vocal communication and music (Coffey et al., 2019; Krizman and Kraus, 2019). The transfer of pitch information is observable noninvasively using EEG/MEG as high-frequency neural–evoked responses (Coffey et al., 2016). Recently, several studies have reported that the FFR extends beyond the stimulus offset by 3–4 cycles, even when signals are isolated from specific brain regions (Fig. 3, second row). These techniques preclude the explanation that an apparent echo might be generated by signals with different degrees of lag summating within a scalp-recorded EEG signal (Coffey et al., 2021; Lerousseau et al., 2021). This finding represents evidence for entrainment echoes in the auditory system at higher frequencies and demonstrates that the FFR involves entrained neural oscillations.
Figure 3. The FFR to the speech syllable /da/ (first row) shows several cycles of oscillations after the stimulus offset, i.e., an entrainment echo (second row). Computing the amplitude over successive small time windows shows that the peak amplitude is reached only after 4–5 cycles (third row) and that the frequency converges from a hypothetical preferred frequency to the stimulus frequency over time and relaxes back toward the preferred frequency after the stimulus offset (fourth row). Modified from Coffey et al. (2021).
Temporal dynamics of entrainment
By definition, neural entrainment occurs during stimulus processing. As such, it is important to identify the unique features of oscillatory dynamics in response to the input that can both indicate the presence of entrainment and affect downstream processing. For example, neural entrainment and stimulus-evoked responses differ in how their temporal dynamics unfold. Whereas evoked responses are typically maximal at the onset of rhythmic stimulation and habituate subsequently, a neural oscillator might require some time to entrain to the stimulation (Fig. 1d). Such an effect has been observed in epidural recordings from the prefrontal cortex of conscious rats during acoustic 40 Hz stimulation (Fig. 4), a “preferred” rate for auditory cortical circuits (see Changes in preexisting oscillatory dynamics by an external drive). The onset of stimulation produced a strong evoked response that rapidly attenuated (Fig. 4, top panel), while the phase synchrony to the stimulus at ∼40 Hz developed slowly over the course of hundreds of milliseconds (Fig. 4, bottom panel, left; Ummear Raza et al., 2023; Gautam et al., 2024). Importantly, whereas evoked responses manifest in a comparable fashion in the primary auditory and the prefrontal cortices (Gautam et al., 2024), the development of the ∼40 Hz phase synchrony in the prefrontal cortex lagged markedly behind that in the primary auditory cortex. Therefore, the slow evolution in phase synchrony appears to reflect a property of the underlying oscillatory circuitry. Interestingly, temporal phase dynamics and performance can vary between driving frequency and harmonics (Gautam et al., 2023; Swerdlow et al., 2024), suggesting the presence of multiple and divergent rhythm-sensitive networks. For example, in the rodent prefrontal cortex, robust evoked responses to click trains were observed at 10, 20, and 40 Hz, whereas strong synchrony emerged only at ∼40 Hz.
Figure 4. Top panel, An epidurally recorded prefrontal 40 Hz ASSR averaged from a group of 11 female Sprague Dawley (SD) rats. Vertical lines mark the stimulus period. Bottom panel, The single EEG epochs used to generate the 40 Hz ASSR (top panel) were bandpass filtered (38–42 Hz) and overlaid to highlight the evolution of 40 Hz synchrony. Note the delayed emergence of 40 Hz synchrony.
Ross and colleagues were the first to highlight the temporal divergence between evoked responses and phase synchrony at 40 Hz in human volunteers (Ross et al., 2002). They speculated that delayed phase synchrony may be indexing a higher-order function such as temporal integration. Integrating discrete stimuli over time may subserve functions like pattern recognition and predictive coding (Fuster, 2001; Wolff et al., 2022). Since it takes 200–300 ms for gamma phase synchrony to establish in the prefrontal cortex in response to 40 Hz click trains, it is speculated that a minimum of 8–12 clicks may be necessary for rhythm registration in this paradigm.
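The arithmetic behind this estimate is simply the synchrony latency divided by the stimulus period. A minimal sketch follows, in which the function name `cycles_within` is hypothetical and the only inputs are the values quoted above (200–300 ms, 40 Hz).

```python
def cycles_within(latency_ms, rate_hz):
    """Number of stimulus cycles that elapse within a given latency."""
    period_ms = 1000.0 / rate_hz  # 25 ms between clicks at 40 Hz
    return latency_ms / period_ms

min_clicks = cycles_within(200, 40)  # lower bound of the reported latency
max_clicks = cycles_within(300, 40)  # upper bound of the reported latency
print(min_clicks, max_clicks)        # 8.0 12.0
```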
The FFR, described in Entrainment echoes after stimulus offset, can also be distinguished as oscillatory through other features of its temporal dynamics (Coffey et al., 2021; Fig. 3). First, similar to the response to 40 Hz stimulation in rodents (Fig. 4), the FFR amplitude increases over time, peaking after several cycles of input (Fig. 3, third row). Furthermore, frequency-tracking analysis, in which the fundamental frequency of successive overlapping windows is extracted and then plotted over the course of the stimulus, revealed that despite the stimulus having a static fundamental frequency of 98 Hz, the FFR appeared to converge to the stimulation frequency and then diverge once again following the stimulus offset, as expected from an entrained oscillator with a preferred frequency at ∼80 Hz (Fig. 3, fourth row; compare Fig. 2b). This pattern was most pronounced in data extracted from the auditory cortex, whereas FFR generators located in the thalamus and below tended to show tracking frequencies that better matched those of the incoming stimulus, also pointing to a specific neural origin of auditory pitch-related oscillatory entrainment. In a separate experiment, responses were measured to stimuli that were directly preceded by stimuli with higher or lower pitches in a random stream. The results suggested that the frequency of the previous stimulus affects the amplitude and frequency of the response to the subsequently presented stimulus for several cycles.
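The logic of such a frequency-tracking analysis can be sketched on synthetic data. The example below is an illustration under stated assumptions rather than the pipeline of Coffey et al. (2021): the toy response is a cosine whose instantaneous frequency relaxes from an assumed preferred frequency (80 Hz) to the stimulus frequency (98 Hz) and back after an assumed offset at 300 ms, the dominant frequency of each window is found by a simple grid search over a Hann-windowed Fourier projection, and the sampling rate, time constants, window length, and all function names are hypothetical.

```python
import cmath
import math

FS = 2000.0  # sampling rate in Hz (assumed)

def toy_response(stim_hz=98.0, preferred_hz=80.0, stim_dur=0.3,
                 total_dur=0.4, tau=0.03):
    """Synthetic response whose frequency is pulled toward the stimulus."""
    phase, f_off, out = 0.0, preferred_hz, []
    for i in range(int(round(total_dur * FS))):
        t = i / FS
        if t < stim_dur:   # during stimulation: converge to stimulus rate
            f = stim_hz + (preferred_hz - stim_hz) * math.exp(-t / tau)
            amp, f_off = 1.0, f
        else:              # after offset: relax back and decay
            f = preferred_hz + (f_off - preferred_hz) * math.exp(-(t - stim_dur) / tau)
            amp = math.exp(-(t - stim_dur) / 0.05)
        phase += 2.0 * math.pi * f / FS
        out.append(amp * math.cos(phase))
    return out

def dominant_frequency(segment, f_lo=60.0, f_hi=120.0, step=0.5):
    """Frequency (Hz) maximizing the Hann-windowed Fourier magnitude."""
    n = len(segment)
    hann = [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]
    best_f, best_mag = f_lo, -1.0
    for j in range(int((f_hi - f_lo) / step) + 1):
        f = f_lo + j * step
        z = sum(w * x * cmath.exp(-2j * math.pi * f * k / FS)
                for k, (w, x) in enumerate(zip(hann, segment)))
        if abs(z) > best_mag:
            best_f, best_mag = f, abs(z)
    return best_f

def track_frequency(signal, win=0.05, hop=0.01):
    """Dominant frequency in successive overlapping windows."""
    w, h = int(round(win * FS)), int(round(hop * FS))
    centers, freqs = [], []
    for start in range(0, len(signal) - w + 1, h):
        centers.append((start + w / 2.0) / FS)
        freqs.append(dominant_frequency(signal[start:start + w]))
    return centers, freqs

centers, freqs = track_frequency(toy_response())
early, late, post = freqs[0], freqs[22], freqs[-1]  # windows centered at 25, 245, 375 ms
```

Under these assumptions, the estimate starts below the stimulus frequency (`early`), settles near 98 Hz during sustained stimulation (`late`), and drifts back toward the assumed preferred frequency after the offset (`post`). A purely stimulus-following response would instead sit at 98 Hz throughout.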
Together, the temporal dynamics of the neural response during rhythmic stimulation are beginning to be understood and provide positive evidence for endogenous oscillations in dynamic sensory processing in the cases studied here. These oscillation-specific dynamics seem to occur at frequencies that resemble “preferred” ones for oscillatory circuits, converging with approaches described in Changes in preexisting oscillatory dynamics by an external drive (see also Kaya and Henry, 2022, for a paradigm that combines the oscillator's preferred rate with its ability to adapt to sensory input). A focus on the temporal evolution of neural dynamics can reveal entrained oscillations even when they cannot be measured in the absence of stimulation.
Modeling phase dynamics during rhythmic stimulation
One of the key reasons oscillator models are useful is that they reduce the behavior of complex networks to a single governing variable, the phase. Pushing a pendulum in one direction at a particular phase will increase its speed; pushing it at another phase will reduce it (Fig. 5). These phase dynamics are critical not only for identifying oscillatory entrainment but also for the theoretical utility of this process in temporal cognition, such as parsing and prediction. For this reason, it is critical to identify how entrainment may differ from alternative sources of synchrony and what function it may serve in downstream processing. An important approach to this end has been the careful study of competing computational models and their comparison with both neurophysiological and behavioral data.
Figure 5. Phase dynamics during stimulation are a key feature of oscillatory entrainment. The panel plots the phase response curve of a nonlinear neural mass model oscillator (Wilson and Cowan, 1972) when receiving pulsatile stimuli. The model's response to pulsatile stimulation (whether it lags or advances and by how much) depends on the phase at which the stimulation occurs, with one regime that causes phase lags (yellow) and another that causes phase advancement (red). This leads to interesting dynamics when scaled to sequences of stimulation: there are two “null phases” at which stimulation causes zero phase shift, one an attractor (filled circle) and one a repeller (unfilled circle). These dynamics drive synchrony toward the attractor phase and are a defining feature of oscillatory function. By modeling the phase dynamics of neural recordings, we can begin to distinguish dynamics of oscillatory and nonoscillatory sources in the brain.
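How a phase response curve with two null phases produces entrainment can be sketched with a one-dimensional phase map. This is a deliberately simplified illustration and not the Wilson–Cowan model shown in the figure: the sinusoidal curve `prc`, its strength, and the function names are assumptions chosen so that one half of the cycle produces lags and the other half produces advances.

```python
import math

def prc(phase, strength=0.1):
    """Assumed phase response curve; phase and shift are in cycles.

    Pulses arriving at phases in (0, 0.5) delay the oscillator and pulses
    arriving in (0.5, 1) advance it, so the shift is zero at 0 and 0.5.
    """
    return -strength * math.sin(2.0 * math.pi * phase)

def entrain(phase0, n_pulses=50, detuning=0.0, strength=0.1):
    """Phase at which successive pulses arrive, after n_pulses pulses.

    detuning is the mismatch between stimulus period and natural period,
    expressed in cycles of the oscillator per stimulus period.
    """
    phase = phase0 % 1.0
    for _ in range(n_pulses):
        phase = (phase + prc(phase, strength) + detuning) % 1.0
    return phase

def circ_dist(a, b):
    """Shortest distance between two phases on the unit circle (cycles)."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

# Different starting phases all converge on the attractor at phase 0.
final_phases = [entrain(p) for p in (0.1, 0.3, 0.45, 0.55, 0.8)]

# A small perturbation away from the repeller at phase 0.5 grows.
escaped = circ_dist(entrain(0.501, n_pulses=5), 0.5)

# With a small rate mismatch the oscillator still locks, at a shifted phase.
locked = entrain(0.3, n_pulses=200, detuning=0.05)
```

In this toy map, locking is possible only while the detuning does not exceed the strength of the phase response curve. With a detuning of 0.05 and a strength of 0.1, the stable phase satisfies sin(2πφ) = 0.5, i.e., φ = 1/12 of a cycle.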
Doelling et al. (2019) established this approach in music perception, highlighting that theories of entrainment to support either auditory segmentation (Giraud and Poeppel, 2012) or prediction (Arnal et al., 2015; Morillon et al., 2016) require a relatively consistent phase relationship with the stimulus to be effective, regardless of the stimulus rate. They compared an oscillator against a linear response model in their phase consistency with music across a range of note rates. In doing so, they found that MEG data of participants listening to the same music had higher phase concentration, in line with the oscillator model. The results were consistent across two experiments and three sets of musical pieces. These findings supported an oscillatory entrainment hypothesis in the prediction of musical notes in natural music.
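Phase concentration of this kind is commonly quantified as the length of the mean resultant vector of phase angles, which is 1 when all phases are identical and near 0 when they are spread uniformly. The sketch below illustrates this general measure on simulated phases; it is not necessarily the exact statistic used by Doelling et al. (2019), and the function name, sample sizes, and distributions are assumptions.

```python
import cmath
import math
import random

def phase_concentration(phases):
    """Mean resultant vector length of phase angles given in radians."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)

rng = random.Random(0)  # fixed seed so the illustration is reproducible

# Assumed toy data: phases relative to note onsets in two hypothetical cases.
consistent = [rng.gauss(0.5, 0.3) for _ in range(200)]            # tight clustering
inconsistent = [rng.uniform(-math.pi, math.pi) for _ in range(200)]  # no preferred phase

r_consistent = phase_concentration(consistent)
r_inconsistent = phase_concentration(inconsistent)
```

Comparing such values across stimulus rates, for recorded data and for each candidate model, is one way to ask which model better reproduces the phase consistency seen in the data.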
The FFR has also been modeled with a canonical model of mode-locked neural oscillations, which successfully predicts the nonlinear responses to musical intervals observed in human data (Lerud et al., 2014). However, evoked response models (Bidelman, 2015) and feed-forward delay models (Tichko and Skoe, 2017) are also generally effective at producing signals that closely resemble recorded FFRs.
More recently, Doelling et al. (2023) examined corresponding effects on behavior. They studied a temporal prediction task in an “imprecisely isochronous” sequence of tones, where the interval between tone onsets is normally distributed and centered on a specific period, akin to syllable durations in speech. They found that participants behaved in accordance with Bayesian principles, showing evidence for a prior expectation for rhythmicity in sequences. They tested a number of mechanistic models to replicate this behavior: a simple ramp, a predictive ramp (Egger et al., 2020), a nonlinear oscillator (Wilson and Cowan, 1972), and an adaptive frequency oscillator (AFO). They found that only the two oscillator models could support this Bayesian prior for rhythmicity and only the AFO could do so at the range of rates reflected in experimental data (1–5 Hz). This capacity to mimic Bayesian computation comes from the phase response curve of the oscillator, which is demonstrated in Figure 5 (see caption). This study has two key findings: (1) entrainment biases perception toward rhythmicity, imposing rhythmic structure on perception, and (2) the base form of a nonlinear oscillator is too simple to support perception; added (nonoscillatory) components support oscillatory dynamics and extend their utility to a wider range of stimuli.
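The benefit of frequency adaptation can be illustrated with a generic phase- and period-correcting predictor driven by an imprecisely isochronous sequence. This is a minimal sketch and not the AFO implemented by Doelling et al. (2023): the linear update rules, the gains, the 2 Hz sequence with 25 ms jitter, and all function names are assumptions made for illustration.

```python
import random

def jittered_onsets(n_tones, period, jitter_sd, seed=0):
    """Tone onsets whose intervals are normally distributed around period."""
    rng = random.Random(seed)
    t, onsets = 0.0, []
    for _ in range(n_tones):
        t += rng.gauss(period, jitter_sd)
        onsets.append(t)
    return onsets

def adaptive_oscillator(onsets, period0, phase_gain=0.5, period_gain=0.1):
    """Predict each onset; correct phase and period from the timing error."""
    period = period0
    expected = onsets[0] + period          # prediction for the second tone
    errors = []
    for onset in onsets[1:]:
        err = onset - expected             # positive: tone later than predicted
        errors.append(err)
        period += period_gain * err        # frequency (period) adaptation
        expected += phase_gain * err + period  # phase correction, next cycle
    return period, errors

def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)

# A 2 Hz sequence with 25 ms interval jitter; both predictors start at 2.5 Hz.
onsets = jittered_onsets(200, period=0.5, jitter_sd=0.025)
adapt_period, adapt_err = adaptive_oscillator(onsets, period0=0.4)
fixed_period, fixed_err = adaptive_oscillator(onsets, period0=0.4, period_gain=0.0)

late_adapt = mean_abs(adapt_err[-50:])  # prediction error once adapted
late_fixed = mean_abs(fixed_err[-50:])  # persistent bias without adaptation
```

With `period_gain` set to zero, the predictor keeps its initial period and shows a persistent timing bias, whereas the adapting version converges on the sequence rate. This mirrors, in a very reduced form, why adaptation widens the range of rates an oscillator model can handle.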
Ruling out alternative mechanisms through the study of selective neuropsychological impairments
In some cases, it is difficult to attribute neural dynamics during stream presentation to oscillatory entrainment or other processes, even using computational models. For example, as rhythms inherently consist of concatenated intervals, it is not possible to design a periodic stimulus that does not contain interval information. In such cases, even subtraction of a control condition (e.g., aperiodic intervals; Breska and Deouell, 2017b) may not be sufficient, as it is difficult to rule out that neural patterns specific to periodic stimulation reflect facilitated operation of interval prediction mechanisms.
A unique solution to this challenge is offered by studying neurological patients, specifically those with cerebellar dysfunction (CD). While the cerebellum has traditionally been considered part of the motor system, modern research has implicated it in higher cognitive functions (Sokolov et al., 2017) and, relevant here, in timing and temporal prediction. Critically, recent behavioral work has shown that CD patients are selectively impaired in interval timing and prediction, and not in rhythm-based timing (Grube et al., 2010; Breska and Ivry, 2018, 2021). Therefore, CD patients enable studying entrainment during periodic stimulation while ascertaining that interval-based mechanisms are not involved and do not explain the neural dynamics. In a first study that applied this rationale, Breska and Ivry (2020) measured EEG in CD patients and age-matched neurotypical controls. They first verified that CD patients were impaired in interval-based temporal prediction and indeed found reduced levels of phase alignment at the stimulus frequency relative to controls, as well as reduced behavioral benefit of predictive interval cues. Then, they showed that in a periodic condition, CD patients showed a similar degree of phase alignment as controls, which was also stronger than the patients’ phase alignment in interval prediction. These findings establish that phase alignment during periodic stimulation does reflect rhythm-specific oscillatory mechanisms rather than interval prediction (Breska and Ivry, 2020).
Conclusion
Is an oscillator a useful model for how neural circuits respond to rhythmic stimulation? As we have shown, the answer to this question strongly depends on the cognitive task, sensory domain, and investigated neural mechanism. Our survey has found evidence for and against neural entrainment, defined as neural oscillations phase-locked to sensory rhythms. While oscillations are ubiquitous in extracellular and noninvasive brain recordings, their underlying generators likely arise from an array of distinct mechanisms, some of which behave like physical oscillators, and some of which do not. Moreover, oscillatory dynamics represent only one tool in the arsenal of neural dynamics deployed by the brain to support perception and cognition. It therefore behooves us as researchers to consider how such oscillations interact with other nonoscillatory components to support cognition, to better understand where they can be useful and where they should be left to the side. One consequence of this conclusion is that oscillator models of neural circuits should be informed by the known properties of neurophysiological mechanisms (Fig. 1e). For instance, the ability of a stimulus to entrain endogenous oscillations will depend on its frequency and/or the neural circuitry underlying the oscillation. As such, asking whether an oscillator is a useful model to describe neuronal oscillations is too simplistic, as the answer depends on the complex combination of parameters of the investigated circuit. Addressing the challenges described here will bring us closer to understanding neural rhythms and their role in the processing of sensory ones.
Footnotes
K.B.D. is funded by the Fondation pour l’Audition (RD-2020-10) and a French government grant managed by the Agence Nationale de la Recherche under the France 2030 program (ANR-23-IAHU-0003). E.B.J.C. is funded by the Natural Sciences and Engineering Research Council of Canada and the Fonds de recherche du Québec. A.B. is funded by the Max-Planck Society, Germany. B.Z. is funded by the Fondation pour l’Audition (FPA-RD-2021-10) and the Agence Nationale de la Recherche (ANR-21-CE37-0002).
*K.D., K.B.D., A.B., E.B.J.C., D.V.S., and B.Z. contributed equally to this work.
The authors declare no competing financial interests.
Correspondence should be addressed to Benedikt Zoefel at benedikt.zoefel@cnrs.fr.