Abstract
Recent studies suggest that time estimation relies on bodily rhythms and interoceptive signals. We provide the first direct electrophysiological evidence suggesting an association between the brain's processing of heartbeat and duration judgment. We examined the heartbeat-evoked potential (HEP) and the contingent negative variation (CNV) during an auditory duration-reproduction task and a control reaction-time task spanning 4, 8, and 12 s intervals, in both male and female participants. Interoceptive awareness was assessed with the Self-Awareness Questionnaire (SAQ) and interoceptive accuracy through the heartbeat-counting task (HCT). Results revealed that SAQ scores, but not the HCT, correlated with mean reproduced durations, with higher SAQ scores being associated with longer and more accurate duration reproductions. Notably, the HEP amplitude changes during the encoding phase of the timing task, particularly within 130–270 ms (HEP1) and 470–520 ms (HEP2) after the R-peak, demonstrated interval-specific modulations that did not emerge in the control task. A significant ramp-like increase in HEP2 amplitudes occurred during the duration-encoding phase of the timing task but not during the control task. This increase within the reproduction phase of the timing task correlated significantly with the reproduced durations for the 8 s and the 4 s intervals. The larger the increase in HEP2, the greater the under-reproduction of the estimated duration. CNV components during the encoding phase of the timing task were more negative than those in the reaction-time task, suggesting a greater orientation of executive resources toward time. We conclude that interoceptive awareness (SAQ) and cortical responses to heartbeats (HEP) predict duration reproductions, emphasizing the embodied nature of time.
- contingent negative variation (CNV)
- duration reproduction
- heartbeat-evoked potential (HEP)
- heartbeat-counting task
- time perception
Significance Statement
Recent fMRI meta-analyses have confirmed that the insula, the primary interoceptive cortex, is one of two regions in the brain essential for processing time passage. It has also been shown that the heart rate influences subjective duration. The notion of embodiment in this context suggests that the interoceptive system creates the sense of time. Besides the neuroanatomical and heart-rate correlations with subjective time, not much is known about the heart–brain interactions while individuals judge duration. Here we show how specific components of the heartbeat-evoked potential in the EEG are related to performance in a duration-reproduction task. A ramp-like increase in HEP2 amplitudes across estimated durations predicts duration reproduction, a neural signature not found in a control reaction-time task.
Introduction
Timing mechanisms in the brain are essential for an organism to understand temporal environmental patterns. While the neural basis of space perception is well established, neural structures of time perception are less understood. Meta-analyses of functional magnetic resonance imaging (fMRI) studies implicate the involvement of subcortical areas in the subsecond duration range, while cortical regions are more active with suprasecond durations (Nani et al., 2019; Teghil et al., 2019), indicating a network of distributed cerebral systems for time perception (Teki et al., 2011; Buhusi et al., 2018).
Teghil et al. (2020a,b) indicate how parallel neural timing systems are activated depending on the specifics of the task. In their experimental setup, temporal intervals of several seconds were filled with regularly or irregularly presented visual stimuli. Timing performance in the irregular condition was modulated by the participants’ degrees of interoceptive awareness [assessed with the Self-Awareness Questionnaire (SAQ); Longarzo et al., 2015]. Moreover, participants who were more accurate in reproducing time intervals in the irregular condition not only had higher scores in interoceptive awareness but also showed stronger connectivity of the posterior insula with other brain regions. This shows how the insular cortex, which is the primary interoceptive cortex, plays a role in time perception. The interpretation favored by the authors was that the ability to sense their body states enabled participants to be more accurate in the duration-reproduction task when no regular external cues were available.
The two most recent meta-analyses with over 100 neuroimaging experiments using activation-likelihood estimates for brain structures converge on two brain areas: the (pre)SMA and bilateral insula (Mondok and Wiener, 2023; Naghibi et al., 2023). These results are consistent with the idea that subjective time is driven by sensorimotor or enactive (SMA) processing (Balasubramaniam et al., 2021) and interoceptive or embodied (insula) experience (Craig, 2009a).
Over a decade ago, Craig (2009a) had proposed a conceptual link between bodily states, subjective time, and insular cortex function, which has since gained neurophysiological support (Wittmann, 2013). Utilizing a duration-reproduction paradigm, Wittmann et al. (2010, 2011) reported a ramp-like increase (climbing activity) in fMRI activation in the posterior insula terminating with the end of the stimulus during the presentation of the first interval (the encoding phase). In the second interval, when participants were required to reproduce the duration of the first stimulus, similar climbing activity was noted in the anterior insula.
Further research integrating peripheral–physiological measures confirmed the autonomic nervous system's role in time perception, with changes in cardiac interbeat intervals and skin conductance observed across several seconds in auditory and visual tasks (Meissner and Wittmann, 2011; Otten et al., 2015) or with cardiac dynamics influencing contraction or expansion of time (Arslanova et al., 2023; Sadeghi et al., 2023). Additionally, the heartbeat-evoked potential (HEP), marking cortical responses to cardiac signals (Montoya et al., 1993; Park et al., 2016), has been shown to predict timing estimates for intervals lasting 120 s (Richter and Ibáñez, 2021). These findings indicate that interoceptive processes are fundamental to time perception across milliseconds to seconds.
The heartbeat-counting task (HCT), as defined by Schandry (1981), requires individuals to count their felt heartbeats without pulse monitoring over a set of intervals. Studies have reported a link between accurate HCT performance and performance in timing tasks (Meissner and Wittmann, 2011; Richter and Ibáñez, 2021), although this association was not observed in older participants (Otten et al., 2015). The validity of this task as a measure of interoceptive accuracy has been recently questioned (Desmedt et al., 2018, 2020; Zamariola et al., 2018, 2019). Many limitations of the HCT were revealed by Desmedt et al. (2018, 2020), who demonstrated that the successful task outcome is influenced by time- and knowledge-based strategies under its original instructions and is less dependent on actual heartbeat perception. This research suggested that the validity of the task might be improved by using adapted instructions that prompt participants only to count their felt heartbeats (Desmedt et al., 2020), which we aimed to implement.
Gradual increases in activity, described as “climbing activity,” have been observed in the brain and heart's response to timing tasks, as reported in both fMRI and psychophysiological studies (Wittmann, 2013), as well as in the EEG research, especially in the SMA (Casini and Vidal, 2011). The latter includes the contingent negative variation (CNV), a slow negative shift observed in both explicit (duration estimation) and implicit (reaction time) timing tasks (Macar and Vidal, 2003; Röhricht et al., 2018; Pfeuty et al., 2019). The CNV was initially found to be related to the waiting time before a timed motor action and had subsequently been interpreted as representing a memory trace for explicitly learned durations. However, the role of CNV, traditionally linked to an internal timing mechanism, has since then been debated due to inconsistencies across studies, leading to the suggestion that it reflects broader time-based decision-making processes rather than a specific timing mechanism (Kononowicz and van Rijn, 2011; Van Rijn et al., 2011).
In the present study, we employed a multifaceted approach to investigate time perception, utilizing an auditory duration-reproduction task that required participants to reproduce the duration of presented tones, alongside a secondary working-memory task to prevent counting strategies. A reaction-time task was used as a control. During these tasks, we collected EEG data and monitored heartbeats to examine CNV and HEP components. Trait-like interoceptive awareness was assessed with a questionnaire (SAQ). The participants’ performance in the classic heartbeat-counting task was also assessed. We hypothesized that a stronger body signal representation, marked by (1) more accurate heartbeat-tracking ability, (2) greater HEP modulation during time estimation, and (3) a higher SAQ score, would be associated with better duration-reproduction accuracy. Additionally, we expected a more negative CNV amplitude in the time reproduction task than in the reaction-time task, as a reflection of greater executive resource dedication toward time. Preliminary findings prompted further exploratory analyses to elaborate on the intricate relationship between interoceptive and bodily processing and time perception.
Materials and Methods
Participants
We tested 30 healthy participants (16 males, 14 females) aged 18–36 years (mean age, 22.9; SD = 4.2 years) recruited via an online advertisement, flyers, and word-of-mouth. We excluded all participants who reported having any diagnosed neurological disorders. Additionally, they were screened for any use of medication that would indicate underlying cardiovascular issues. The study was approved by the local ethics committee of the Institute for Frontier Areas of Psychology and Mental Health (IGPP_2021_05).
Study procedure
The schematic illustration of the experimental session and tasks used is presented in Figure 1. Participants first filled out the SAQ. Following a 5 min baseline recording with eyes open, participants performed a heartbeat-counting task and subsequently the duration-reproduction and reaction-time tasks in counterbalanced order (Fig. 1a). The latter two tasks were each divided into three blocks. During pauses between blocks and tasks, participants could stand, walk, or rest. All tasks were programmed and implemented with the experimental software PsychoPy (version 2021.1.4; Peirce et al., 2019). The experimental session took approximately 3 h.
Overview of the experimental design. The schematic illustration of (a) the study design, (b) duration-reproduction task, and (c) reaction-time task. For details, see the Materials and Methods section.
Duration-reproduction task
Figure 1b depicts the duration-reproduction task with encoding and reproduction phases. During the encoding phase, participants were presented with an auditory stimulus (pink noise) lasting for a specific target duration (4, 8, or 12 s). Then, during the reproduction phase, the second sound (brown noise) was presented, and participants had to press the button to stop it when its duration matched that of the encoding sound. There was a pause between the two stimuli with a jittered time interval between 2 and 3 s. Participants were advised against counting to ensure accuracy relied on perception rather than rhythmic strategies (Hinton et al., 2004; Riemer et al., 2022). To further mitigate potential counting strategies, a secondary working-memory task was incorporated within the primary time reproduction task. At the onset of each trial, four different digits from 1 to 8 were shown on the screen for 3 s. Following the completion of the primary task's duration reproduction (pressing the button in the reproduction phase), a single digit was presented for 3 s, and the participant was required to indicate whether this number was among the initial four digits shown at the beginning. Participants were also asked to press a key as quickly as possible when the encoding sound ended. This enabled us to compare both the encoding and the reproduction phases with the control reaction-time task (see below).
The auditory stimuli were created with the software Audacity (version 3.0.0, 2021). Pink and brown noises were used for their nonirritating quality over time and to distinguish between encoding and reproduction phases, preventing confusion during extended trials. Inter-trial intervals were set with a jittered duration between 1 and 2 s. A total of 126 trials were randomly presented with 42 trials allocated for each target duration.
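The trial structure described above can be illustrated with a minimal Python sketch. It is not the authors' PsychoPy script: the function name `build_trials`, the dictionary keys, and the choice of a uniform distribution for the jittered pauses are assumptions made for illustration only.

```python
import random


def build_trials(durations_s=(4, 8, 12), n_per_duration=42, seed=None):
    """Return a shuffled trial list with jittered pauses and a digit set."""
    rng = random.Random(seed)
    trials = [{"target_s": d} for d in durations_s for _ in range(n_per_duration)]
    rng.shuffle(trials)  # random presentation order
    for trial in trials:
        # Pause between encoding and reproduction sound (2-3 s);
        # uniform sampling is an assumption, the text only says "jittered".
        trial["gap_s"] = rng.uniform(2.0, 3.0)
        # Inter-trial interval (1-2 s), same assumption.
        trial["iti_s"] = rng.uniform(1.0, 2.0)
        # Working-memory set: four different digits from 1 to 8.
        trial["memory_set"] = rng.sample(range(1, 9), 4)
    return trials


trials = build_trials(seed=1)
```

With the default arguments this yields 126 trials, 42 per target duration.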
Control reaction-time task
During this task, auditory stimuli (pink noise) with 4, 8, or 12 s durations were presented, and participants were instructed to press a button as fast as possible when the stimulus ended (Fig. 1c). The same secondary working-memory task explained in the duration-reproduction task was included in each trial. This made the control task comparable in timing-irrelevant cognitive load to the encoding phase. The 126 trials were presented in random order.
Heartbeat-counting task
During this task, participants were instructed to count their heartbeats during the given time intervals. They were requested not to monitor their pulse (e.g., by feeling their radial artery on the wrist) but to feel their heartbeats intuitively. To mitigate the confounding impact of time- or knowledge-based strategies (Desmedt et al., 2018, 2020), we utilized adapted instructions by asking participants explicitly not to guess the number of heartbeats or count for estimating the given time interval but to focus only on counting their felt heartbeats. A beep indicated the start and the end of the interval during which participants had to count their felt heartbeats. At the end of each time window, participants typed their heartbeat counts on a keyboard and indicated (with a mouse click on a visual slider scale) how confident they were, ranging from “very unsure” to “very sure.”
There were three training trials with randomly selected durations (25, 30, and 35 s) and six experimental trials with durations of 25, 30, 35, 40, 45, and 50 s in randomized order. The objective heartbeat counts were recorded by electrocardiogram (ECG), which was part of the EEG acquisition setup.
Questionnaires
The subjective awareness of body states was assessed with the German version (SAQ-17; Kübel and Wittmann, 2020) of the Self-Awareness Questionnaire (SAQ, Longarzo et al., 2015). Participants score how often they generally feel specific body sensations with answer categories ranging between 0 (“never”) and 4 (“always”) on 17 Likert scale items.
Signal recording
Continuous EEG signals were recorded using a 32-channel electrocap (actiCHamp, Brain Vision) with active electrodes positioned according to the extended 10–10 international system (American Electroencephalographic Society, 1993). All electrodes were referenced to the Fz electrode with the ground electrode placed on the forehead. Electrode impedances were kept below 10 kΩ. EEG signals were digitized with a 1,000 Hz sampling rate and bandpass filtered within the 0.01–120 Hz range. One ECG signal was acquired using three Ag/AgCl electrodes, which were positioned according to the Lead II Einthoven configuration: two electrodes were placed on the right clavicle, and the left hip/abdomen (active electrodes), and one electrode was placed on the left clavicle (ground electrode).
Data processing
The data processing was implemented with the Matlab software (R2022b, MathWorks) using custom-written scripts and the EEGLAB toolbox functions. After down-sampling the recorded EEG signals to 250 Hz and applying an FIR filter within the 0.05–70 Hz range, line noise was adaptively estimated and removed using a set of Slepian filters (multitapers). After this initial filtering step, large nonstationary artifacts were identified and corrected using the Artifact Subspace Reconstruction (ASR) approach (Chang et al., 2020). The cleaned EEG signals were then rereferenced to the average reference and decomposed into independent components (ICs) using the Adaptive Mixture Independent Component Analysis (AMICA; Palmer et al., 2006, 2011). Since the low-frequency EEG can affect ICA decomposition results (Winkler et al., 2015), we calculated the ICs using 1–70 Hz bandpass filtered data and applied the resulting decomposition to the 0.05–70 Hz bandpass filtered data. The identified ICs were then automatically classified and labeled using the ICLabel plugin. This machine-learning approach has been trained to classify the ICs based on several characteristics, such as spectral properties and brain topography (Pion-Tonachini et al., 2019). Nonbrain-related ICs were selected for exclusion based on their spectra, scalp maps, and time courses. On average, this process eliminated nine ICs per subject, effectively cleansing the EEG signals of eyeblinks, muscle noise, heart artifacts, and other types of interference. Epoch extraction was carried out separately based on the component of interest.
We rereferenced the cleaned EEG signals to the average of linked mastoids for the CNV signal analysis. To compare the reaction-time task and the encoding phase of the duration-reproduction task, the EEG signals of both tasks were epoched around the onset of the encoding sound from 1 s before to 12 s after the onset. The average ERPs over the frontocentral scalp electrodes (Fz, FC1, Cz, FC2, F1, C1, C2, F2, and FCz) were then considered for the CNV signal comparison (for reference, see Kononowicz et al., 2015; Robinson and Wiener, 2021). We also contrasted CNV signals between the reproduction phase of the duration-reproduction task and the reaction-time task. The frontocentral EEG signals of both tasks were first epoched, time locked to the onset of the sound (encoding sound in the reaction-time task and reproduction sound in the time-reproduction task), from 3 s before to 15 s after the sound. The extracted segments were baseline corrected considering a 500 ms presound time window. From the obtained segments, final epochs were extracted time locked to the motor response (the button press) in both tasks, from 4 s (for 4 s intervals), 8 s (for 8 s intervals), and 10 s (for 12 s intervals) before the motor response to 1 s after it. These intervals were chosen based on the mean reproduced time of these target durations.
We extracted the CNV components (initial CNV or iCNV and late CNV or lCNV) to assess the modulation of CNV more precisely by both tasks. For the iCNV component, we calculated the average amplitude of the extracted frontocentral ERPs within 500–1,500 ms after the stimulus (Ng and Penney, 2014; Kononowicz et al., 2015). The lCNV component was considered as the average amplitude of the frontocentral ERP within the last 2 s of each interval. The CNV slope as the slope of the fitted line to the ERP in this time window was also calculated.
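As an illustration of the three CNV measures defined above, the following Python sketch computes them from a single averaged frontocentral ERP. It is a simplified stand-in for the authors' Matlab scripts, not a reproduction of them: the function names are hypothetical, the ERP is assumed to be a plain list of amplitudes in µV with matching time stamps in ms relative to stimulus onset, and the slope is obtained by ordinary least squares over the last 2 s.

```python
from statistics import mean


def window_indices(times_ms, start_ms, end_ms):
    """Indices of samples whose time stamps fall in [start_ms, end_ms]."""
    return [i for i, t in enumerate(times_ms) if start_ms <= t <= end_ms]


def mean_amplitude(erp_uv, times_ms, start_ms, end_ms):
    """Average ERP amplitude (µV) inside the window."""
    return mean(erp_uv[i] for i in window_indices(times_ms, start_ms, end_ms))


def fitted_slope(erp_uv, times_ms, start_ms, end_ms):
    """Least-squares slope (µV per second) of the ERP inside the window."""
    idx = window_indices(times_ms, start_ms, end_ms)
    xs = [times_ms[i] / 1000.0 for i in idx]
    ys = [erp_uv[i] for i in idx]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def cnv_components(erp_uv, times_ms, interval_s):
    """iCNV (500-1,500 ms), lCNV and slope (last 2 s of the interval)."""
    end_ms = interval_s * 1000
    return {
        "iCNV": mean_amplitude(erp_uv, times_ms, 500, 1500),
        "lCNV": mean_amplitude(erp_uv, times_ms, end_ms - 2000, end_ms),
        "slope": fitted_slope(erp_uv, times_ms, end_ms - 2000, end_ms),
    }


# Synthetic ERP sampled at 250 Hz that drifts by -1 µV per second.
times_ms = [i * 4 for i in range(1001)]  # 0 to 4,000 ms
erp_uv = [-t / 1000.0 for t in times_ms]
components = cnv_components(erp_uv, times_ms, interval_s=4)
```

For this synthetic linear drift, the slope equals the drift rate and each mean amplitude equals the ERP value at the center of its window.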
For the HEP analysis, we first filtered the preprocessed signals using a high-pass filter with a cutoff frequency of 1 Hz. After detecting the exact location of R-peaks in the ECG signal using discrete wavelet transform analysis with the “sym4” wavelets (Khoshnoud et al., 2022), their locations were integrated into the EEG signals. Data epochs were then generated time locked to the detected R-peaks, from 200 ms before to 600 ms after each peak. Baseline correction was performed from −100 to −50 ms. We focused on the average HEP signal over a frontocentral cluster (FC1, Cz, FC2, Fz), as reported in the previous literature (Park et al., 2016; Park and Blanke, 2019; Khoshnoud et al., 2022). Since we aimed to assess the association between the interval length and the HEP signal, we categorized each HEP signal based on the corresponding interval when it occurred (HEP-4s, HEP-8s, and HEP-12s). The HEPs were then averaged for each duration interval and each task. This procedure yielded three grand-averaged HEPs for each task (the reaction-time task, the encoding, and the reproduction phase of the duration-reproduction task). The average HEP amplitudes within time windows with significant differences are referred to as HEP1 and HEP2, respectively. Apart from the average HEP1 and HEP2, we assessed the gradual development of these components over time by calculating the rate of change across the whole target duration (HEP1diff and HEP2diff).
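The epoching and baseline-correction step can be sketched in a few lines of Python. This is an illustrative simplification, not the authors' pipeline: it assumes a single already-cleaned signal (for example, the frontocentral cluster average) at 250 Hz and a list of R-peak sample indices that have already been detected. The function name `extract_hep` and the synthetic data are hypothetical.

```python
def extract_hep(eeg_uv, r_peak_samples, sfreq_hz=250,
                tmin_ms=-200, tmax_ms=600, baseline_ms=(-100, -50)):
    """Average baseline-corrected EEG epochs time locked to R-peaks."""
    def to_samples(ms):
        return int(round(ms * sfreq_hz / 1000))

    start, stop = to_samples(tmin_ms), to_samples(tmax_ms)
    # Baseline window expressed as indices within one epoch.
    b0 = to_samples(baseline_ms[0]) - start
    b1 = to_samples(baseline_ms[1]) - start
    epochs = []
    for peak in r_peak_samples:
        if peak + start < 0 or peak + stop >= len(eeg_uv):
            continue  # skip heartbeats too close to the recording edges
        epoch = eeg_uv[peak + start: peak + stop + 1]
        baseline = sum(epoch[b0:b1 + 1]) / (b1 - b0 + 1)
        epochs.append([value - baseline for value in epoch])
    if not epochs:
        raise ValueError("no complete epochs available")
    # Sample-wise average across heartbeats.
    return [sum(column) / len(epochs) for column in zip(*epochs)]


# Synthetic signal: constant 5 µV offset plus a 2 µV deflection
# 300 ms (75 samples at 250 Hz) after each simulated R-peak.
r_peaks = [100, 300, 500, 700]
eeg = [5.0] * 1000
for peak in r_peaks:
    eeg[peak + 75] += 2.0
hep = extract_hep(eeg, r_peaks)
```

In practice the R-peak list would be restricted to heartbeats falling inside a given interval type and task phase before averaging, which is how separate HEP-4s, HEP-8s, and HEP-12s averages would be obtained.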
Statistics
The dependent behavioral variables in the duration-reproduction task were (1) reproduced duration, (2) absolute accuracy of duration reproduction, and (3) precision (variance) scores. Whereas the reproduced duration is the standard measure of the duration-reproduction task, i.e., the temporal length of reproduced duration, absolute accuracy is defined as the absolute deviation of the reproduced duration from the presented stimulus duration divided by the presented duration. Smaller absolute accuracy scores indicate more accurate duration reproductions, independent of under- or over-reproduction. The standard deviation and the coefficient of variation of the reproduced duration for each interval measured the precision of the participants’ estimates. The following statistics were carried out with the statistical software JASP (version 0.17.2.1, 2023) and with Matlab (R2022b, MathWorks):
One-way repeated-measures ANOVAs for within-subject differences or Friedman tests (for non-normality of the distribution) were calculated for behavioral comparisons across three target durations.
Differences in the frontocentral CNV or HEP signals between tasks or durations were tested using the cluster-based permutation t test (Maris and Oostenveld, 2007) implemented in the FieldTrip toolbox (Oostenveld et al., 2011). With this procedure, samples with a t statistic higher than a threshold (p < 0.05, two-tailed) are clustered in the same set based on temporal adjacency. The sum of the t values within a cluster was assigned as a cluster-level statistic, and the null hypothesis was evaluated using the maximum of these cluster-level statistics. The two-tailed Monte Carlo p value to reject the null hypothesis corresponds to the proportion of cluster-based randomizations that resulted in a larger test statistic than the observed original cluster-level statistic.
We used a 2 × 3 repeated-measures ANOVA test with the Greenhouse–Geisser correction for degrees of freedom to compare the mean CNV and the mean HEP components considering two tasks (the reaction-time task and the encoding phase of the timing task) and three durations (4, 8, and 12 s) as within-subject factors.
Correlations were calculated with Pearson's correlation coefficients. Whenever we found a non-normality of the distribution in one of the two variables, the Spearman correlation coefficient (ρ) was used.
The gradual development of the HEP components was evaluated separately for each interval using the linear mixed-effect-model (LME) statistic. We considered the task (the reaction-time task and the encoding phase of the timing task) and time (the first second up to the last second of the interval) as fixed-effect factors with each participant as a random intercept. F test results were obtained using ANOVA on the fitted model, and whenever the ANOVA results were significant, the pairwise comparisons were calculated.
A final stepwise regression analysis was performed in Matlab using the six independent variables (iCNV and lCNV amplitudes, CNV slope, HEP1diff, HEP2diff, and SAQ) to identify the key factors influencing the reproduced durations. The significance level for entry into the model was set at p < 0.05.
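The behavioral measures defined at the start of this section can be made concrete with a short Python sketch. It assumes that relative errors are computed per trial and then averaged within a participant and target duration, which the text implies but does not state explicitly. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def reproduction_metrics(reproduced_s, target_s):
    """Behavioral scores for one participant and one target duration."""
    # Signed relative error per trial; negative means under-reproduction.
    rel_errors = [(r - target_s) / target_s for r in reproduced_s]
    avg = mean(reproduced_s)
    sd = stdev(reproduced_s)  # sample standard deviation
    return {
        "mean_reproduced": avg,
        "accuracy": mean(rel_errors),
        "absolute_accuracy": mean(abs(e) for e in rel_errors),
        "sd": sd,
        "cv": sd / avg,  # coefficient of variation
    }


# Hypothetical reproductions (in seconds) of an 8 s target.
scores = reproduction_metrics([7.0, 9.0, 8.0, 6.0], target_s=8)
```

In this example the signed accuracy is −0.0625 while the absolute accuracy is 0.125, showing how over- and under-reproductions partly cancel in the signed score but not in the absolute one.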
Results
Behavioral findings
Two participants were excluded from further analysis because their data suggested that they were not paying attention to the task and reproduced and reacted randomly (>15% missing responses). Figure 2a represents the distribution of the mean reproduced durations for each interval. The general tendency toward under-reproduction increased with increasing interval duration [M (4 s) = 3.95 s, SD (4 s) = 0.51; M (8 s) = 7.41 s, SD (8 s) = 0.93; M (12 s) = 9.71 s, SD (12 s) = 1.37]. A repeated-measures ANOVA with the Greenhouse–Geisser correction (F(1.19,32.31) = 561.9; p < 0.001; η2 = 0.95) showed that the mean reproduced durations for 4 s intervals were significantly shorter than those for 8 and 12 s intervals (t(27) = −20.01, p < 0.001, Cohen's d = −3.45; t(27) = −33.29, p < 0.001, Cohen's d = −5.75, respectively), and reproductions of 8 s were shorter than those for 12 s intervals (t(27) = −13.28; p < 0.001; Cohen's d = −2.3). This analysis served as a foundational check for the consistency and reliability of the participants’ responses.
Behavioral results. a, Histogram and density curves of the mean reproduced durations for different intervals. b, Boxplot illustrating the accuracy scores across different intervals. c, Boxplot illustrating the absolute accuracy scores across different intervals. Each box represents the interquartile range (IQR) with the line inside indicating the median. The outliers are represented as individual points which are >1.5 * IQR away from the top or bottom of the box.
Regarding the accuracy of reproductions (the deviation of the reproduced duration from the presented stimulus duration divided by the presented duration), significant differences were found between the different interval durations (F(1.24,33.6) = 46.01; p < 0.001; η2 = 0.63). The results (Fig. 2b) indicated that participants under-reproduced the 12 s intervals more than the 8 and 4 s intervals (t(27) = 9.43, p < 0.001, Cohen's d = 1.47; t(27) = 6.22, p < 0.001, Cohen's d = 0.977, respectively). Additionally, under-reproduction was greater in 8 s intervals compared with that in 4 s ones (t(27) = 3.2; p = 0.002; Cohen's d = 0.5). Because large under-reproductions in one trial could average out large over-reproductions in another, we also report the absolute accuracy scores that assess accuracy irrespective of the direction of error.
Due to the non-normal distribution of absolute accuracy, we conducted an analysis of variance using the Friedman test, which showed a significant effect of duration (χ2 = 9.92; df = 2; p = 0.007), with a moderate effect size (Kendall's W = 0.177). While there was no significant difference in absolute accuracy between the 4 and 8 s intervals (z = 2.13; p = 0.07; n = 54) or between the 4 and 12 s intervals (z = 0.93; p = 0.35; n = 54), absolute accuracy scores were significantly higher (less accurate) for the 12 s intervals compared with those for the 8 s intervals (z = 3.07; p = 0.01; n = 54; Fig. 2c). We observed a significant effect of interval on the precision of reproduced durations in terms of standard deviation (F(1.04,28.08) = 1,182.08; p < 0.001; η2 = 0.97). The precision of reproduction declined significantly with interval length (larger SD; all p < 0.001). The coefficient of variation of the reproduced durations varied between intervals (F(2,54) = 3.57; p = 0.035; η2 = 0.11) with a significantly lower coefficient of variation for the 8 s intervals compared with the 4 s ones (t(27) = 2.66; p = 0.031; Cohen's d = 0.46). The mean accuracy of the secondary working-memory task during both tasks was relatively high (97.8% during the reaction-time task and 96.5% during the timing task), showing that participants did attend to the working-memory task, and this controlled for chronometric counting. The performance of the working-memory task did not correlate with the reproduced durations in any of the three intervals (r(26) = −0.06, p = 0.75; r(26) = −0.14, p = 0.48; and r(26) = −0.03, p = 0.84, respectively).
The accuracy of the heartbeat-counting task was not associated with the three reproduced durations (r(26) = 0.11, p = 0.57; r(26) = 0.01, p = 0.92; and r(26) = 0.01, p = 0.94). We observed positive correlations between the SAQ scores and the reproduced durations. After correcting for three comparisons with the false-discovery-rate (FDR) method (Benjamini and Hochberg, 1995), the SAQ score correlated significantly and positively with the mean reproduced duration of the 4 s interval (r(26) = 0.40; pfdr = 0.031), 8 s interval (r(26) = 0.46; pfdr = 0.019), and 12 s interval (r(26) = 0.49; pfdr = 0.019). Given that, on average, participants tended to under-reproduce the durations, specifically for 12 s intervals (as illustrated in Fig. 2), this finding suggests that participants with higher SAQ scores reproduce durations closer to the target duration (less deviation), indicating a more accurate performance. Additionally, negative correlations emerged between the absolute accuracy values and the SAQ scores, with significance for the 12 s interval (r(26) = −0.46; pfdr = 0.042). The larger the SAQ score, the smaller the deviation from the actual stimulus duration, indicating higher accuracy in the time reproduction task.
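For readers unfamiliar with the FDR correction used above, the Benjamini–Hochberg adjustment for a small family of tests can be sketched as follows. The function name is hypothetical, and the raw p values in the example are invented for illustration; they are not the uncorrected p values of this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value so that adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw_p = [0.04, 0.01, 0.03]  # hypothetical uncorrected p values
adj_p = benjamini_hochberg(raw_p)
```

Here the adjusted values are 0.04, 0.03, and 0.04: the second-largest raw p value (0.03 × 3/2 = 0.045) is capped by the largest one, which is why two tests can share the same corrected p value, as in the correlations reported above.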
Duration encoding phase versus reaction time
Figure 3, a–c, presents the grand-average ERPs time locked to the first sound in the reaction-time task and the encoding phase of the duration-reproduction task. The cluster-based permutation tests revealed that the observed CNV was significantly larger (more negative) within the 1,000–4,000 ms time interval for 4 s, 2,100–5,500 ms for the 8 s, and 3,400–5,300 ms for the 12 s duration (p < 0.05) during the timing task compared with the control task. The scalp topographies in these time intervals showed larger frontocentral negativity for the encoding phase compared with the reaction-time task. Interestingly, the significant time windows were around the offset of the shortest target duration (4 s) when the brain predicts a likely offset.
Distinct CNV for the encoding phase of the duration-reproduction task compared with the reaction-time task. The top plots (a–c) show the grand-average ERPs from the frontocentral scalp electrodes time locked to the first sound in the reaction-time task (blue) and the encoding phase of the duration-reproduction task (orange) for the 4 s (a), 8 s (b), and 12 s (c) intervals. The error shade represents the standard error, and the gray traces represent the interval portion beyond the target interval. Black traces in the time axes show the time interval in which the difference between two ERPs was significant according to the cluster-based permutation test. The bottom plots (a–c) present the corresponding topography for the ERPs within these intervals (corresponding to the phases of the reaction-time task and duration encoding, respectively). d, The average iCNV amplitude with corresponding standard errors for the reaction-time task and the encoding phase of the duration-reproduction task. e, The average lCNV amplitude with corresponding standard errors for both tasks. f, The average CNV slopes for both tasks and three intervals.
Repeated-measures ANOVA showed that the iCNV amplitude was significantly larger for the encoding phase of the duration reproduction than the identical stimulus-presentation phase in the reaction-time task (F(1,27) = 9.96; p = 0.004; η2 = 0.16; Fig. 3d). This indicates a neural-processing difference between these two tasks. After the onset of the first sound in the encoding phase, participants probably oriented their attention toward time and started storing temporal information in the working memory (to be able to reproduce the interval later) while waiting to press the button as soon as the sound ended (explicit timing). In the reaction-time task, they likely waited for the sound to finish and then pressed the button. No duration effects or interactions of duration and task were observed.
The main effects of task and duration for the lCNV amplitude were significant (F(1,27) = 10.21, p = 0.004, η2 = 0.089; F(2,54) = 5.01, p = 0.01, η2 = 0.058). As shown in Figure 3e, the lCNV amplitude was significantly larger (more negative) for the encoding phase than for the reaction-time phase. Regardless of the task, the lCNV amplitude was more negative for the 4 s intervals than that for the 8 and 12 s intervals (t(27) = −2.88, p = 0.01, Cohen's d = 0.017 and t(27) = −2.57, p = 0.02, Cohen's d = 0.025, respectively). Corresponding with the previous literature, larger lCNV amplitudes were associated with shorter durations (Kononowicz et al., 2015; Robinson and Wiener, 2021). As for the slope of the CNV signal (Fig. 3f), we observed a significant main effect of duration (F(2,54) = 10.08; p < 0.001; η2 = 0.125). The slope of the CNV was significantly smaller for 12 s intervals compared with the 4 s (t(27) = 4.3; p < 0.001; Cohen's d = 0.84) and 8 s (t(27) = 3.2; p = 0.004; Cohen's d = 0.63) intervals.
Duration-reproduction phase versus reaction time
To compare the reproduction phase of the timing task with the reaction-time task, the ERPs were extracted in both tasks time locked to the motor response (button press in the reaction-time task and button press in the reproduction phase of the timing task). The extracted grand-average ERPs are presented in Figure 4. Overall, the CNV signal before pressing the button during the duration-reproduction phase was larger (more negative) compared with the control task with significantly higher negative values for the 8 s interval (cluster-based permutation test, p < 0.05).
Distinct CNV signals for the reproduction phase of the duration-reproduction task compared with the reaction-time task. The grand-average frontocentral ERPs time-locked to the button press in both the duration-reproduction phase (red trace) and the reaction-time phase (blue trace).
We assessed the initial and late CNV amplitudes during the duration-reproduction phase. For this, the EEG signals were epoched time locked to the reproduction cue (onset of the second sound) from 1 s before to 12 s afterward, and the iCNV, the lCNV, and the slope of the CNV were calculated (Fig. 5a). Considering the iCNV and lCNV amplitudes, the main effect of duration for both measures using a linear contrast was significant (F(2,54) = 2.90, p = 0.023, η2 = 0.097; F(1.58,42.66) = 3.14, p = 0.036, η2 = 0.10). However, the subsequent post hoc tests failed to reveal a significant difference in iCNV or lCNV among the three durations. For the CNV slope, the main effect of duration was significant (F(1.35,36.49) = 8.42; p = 0.003; η2 = 0.23), highlighting significantly steeper slopes for the 4 s interval compared with the 8 and 12 s interval reproductions (Fig. 5d; t(27) = 3.12, p = 0.006, Cohen's d = 0.69 and t(27) = 3.86, p < 0.001, Cohen's d = 0.85, respectively). We assessed possible associations between the extracted CNV components and the reproduced durations. The reproduced durations showed a significant negative correlation with the mean lCNV amplitude (r(26) = −0.46; p = 0.014) only for the 12 s interval (Fig. 5e). There was no correlation between the slope of the CNV and mean reproduced durations.
Characteristics of the CNV signals during the reproduction phase. a, The grand-average frontocentral ERPs time locked to the duration-reproduction cue (second sound). b, Mean iCNV amplitude with corresponding standard errors for different durations of duration-reproduction phase. c, Mean lCNV amplitude with corresponding standard errors for different durations of the duration-reproduction phase. d, The average CNV slope for the three different durations of the duration-reproduction phase. e, The association between mean reproduced durations and the mean lCNV amplitudes during the duration-reproduction phase.
Modulation of the heartbeat-evoked potential during duration estimation
For each subject, the HEPs were averaged based on the time interval that occurred (HEP-4s, HEP-8s, and HEP-12s) for each task (the reaction-time task and the duration-encoding and duration-reproduction phase of the timing task). These grand-averaged HEPs were then compared across time intervals within each task using the cluster-based permutation test. The duration-encoding phase showed significantly different HEP values for the three durations within [130–270 ms] and [470–520 ms] after the R-peak (p < 0.05, cluster-based permutation test). The grand-average frontocentral HEPs for the encoding phase and significant time windows are presented in Figure 6. The HEP values around these time windows were significantly smaller for the 4 s intervals than the 8 and 12 s intervals. Given that no differences were observed for the reaction time or the reproduction phase of the duration-reproduction task, this highlights a duration encoding-specific function for the HEP amplitude within these time windows.
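The logic of a cluster-based permutation test can be illustrated with a minimal sketch. This is not the analysis code used in the study: it is a simplified two-condition, paired, single-channel version written in plain Python, whereas the comparison above involved three durations. The function names, the cluster-forming threshold of 2.05 (roughly the two-tailed 5% critical t value for 27 degrees of freedom), the number of permutations, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def paired_t(diffs):
    """One-sample t statistic of the paired differences at one time point."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

def max_cluster_mass(t_values, threshold):
    """Largest summed |t| over contiguous runs of samples with |t| > threshold.

    Simplification: positive and negative t values are not separated.
    """
    best = current = 0.0
    for t_val in t_values:
        current = current + abs(t_val) if abs(t_val) > threshold else 0.0
        best = max(best, current)
    return best

def cluster_permutation_test(cond_a, cond_b, threshold=2.05, n_perm=500, seed=0):
    """cond_a, cond_b: per-subject waveforms (subjects x time points)."""
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(wa, wb)] for wa, wb in zip(cond_a, cond_b)]
    n_times = len(diffs[0])

    def statistic(signs):
        t_values = [paired_t([s * d[i] for s, d in zip(signs, diffs)])
                    for i in range(n_times)]
        return max_cluster_mass(t_values, threshold)

    observed = statistic([1] * len(diffs))
    exceed = 0
    for _ in range(n_perm):
        # Under the null, the sign of each subject's difference is exchangeable.
        signs = [rng.choice((-1, 1)) for _ in diffs]
        if statistic(signs) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Synthetic demonstration (hypothetical data, not the study's recordings):
# 28 subjects, 60 samples, with an effect added to samples 20-34 of condition A.
demo_rng = random.Random(1)
n_subj, n_samples = 28, 60
cond_b = [[demo_rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_subj)]
cond_a = [[demo_rng.gauss(0, 1) + (1.5 if 20 <= i < 35 else 0.0)
           for i in range(n_samples)] for _ in range(n_subj)]
observed_mass, p_value = cluster_permutation_test(cond_a, cond_b)
print(f"max cluster mass = {observed_mass:.1f}, p = {p_value:.3f}")
```

Because significance is assessed on the largest cluster rather than on each sample, the procedure controls for multiple comparisons across time points without assuming in advance where an effect should occur.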
The HEPs during the encoding phase. The left panel shows the grand-average HEPs corresponding to the three target intervals in the encoding phase of the duration-reproduction task. Highlighted time windows show significantly different HEPs among the durations. The average topographies within the first time window (HEP1) and the second time window (HEP2) are presented in the right panel.
The average scalp topographies over the first and second time windows revealed less negative frontocentral HEP values for the 4 s-encoding intervals compared with the other two intervals (Fig. 6). The average HEPs in the first and second time windows are referred to as HEP1 and HEP2, respectively.
Exploratory analysis
Association of HEP components with the reproduced durations
For 8 s intervals, the HEP2 amplitude during the reproduction phase was significantly inversely correlated with the reproduced durations (r(26) = −0.43; p = 0.02), indicating that higher HEP2 amplitudes were associated with shorter reproduced durations. Although weaker negative correlations were observed for the 4 s (r(26) = −0.29; p = 0.12) and 12 s (r(26) = −0.34; p = 0.07) intervals, these were not statistically significant. We conducted a further analysis using Fisher's r-to-z transformation to test whether the correlations for the 8 s intervals were significantly different from those observed for the 4 and 12 s intervals. The results of this analysis indicated that the correlation for the 8 s interval was not significantly different from those for the 4 s (z = −0.37; p = 0.70) and 12 s (z = −0.57; p = 0.56) intervals. Similarly, a significant positive correlation was found between HEP2 amplitude and absolute accuracy for 8 s intervals (r(26) = 0.46; p = 0.01) and 4 s intervals (r(26) = 0.46; p = 0.01) but not for 12 s intervals (r(26) = 0.32; p = 0.09), suggesting that for these intervals, increased HEP2 amplitude is linked to less accurate timing.
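A comparison of two correlation coefficients via Fisher's r-to-z transformation can be sketched as follows. This sketch assumes the textbook form of the test for two independent samples; the text does not state which variant was applied, and correlations obtained from the same participants may call for a dependent-correlations test instead. The function name and the input values are hypothetical and are not taken from the results above.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two Pearson correlations.

    Assumes the two correlations come from independent samples.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)        # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))          # two-tailed normal p value
    return z, p

# Hypothetical example values.
z_stat, p_val = fisher_z_difference(-0.50, 30, -0.20, 30)
print(f"z = {z_stat:.2f}, p = {p_val:.2f}")
```

With samples of this size, even a visibly larger correlation often does not differ significantly from a smaller one, which is why a significant correlation at one interval and a nonsignificant one at another should not be read as evidence of a difference between intervals.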
Gradual dynamics of the HEP components: duration encoding phase versus reaction time
Second-by-second dynamics of the HEP components (HEP1 and HEP2) for both tasks (the reaction-time task and encoding phase), separately for each interval (4, 8, or 12 s), are presented in Figure 7. We compared the HEP1 and HEP2 values between the two tasks (task effect) for each second (time effect) using the LME models. For the HEP1 values, a significant main effect of time (F(3,189) = 4.53; p = 0.004) was observed for the 4 s intervals. Subsequent post hoc tests showed that the HEP1 values decreased significantly from the first second up to the third second of the interval only for the encoding phase of the timing task (β = 0.41, z = 2.42, p = 0.027; β = 0.55, z = 3.22, p = 0.003). While the HEP1 values decreased second by second, the HEP2 values showed a rising trend for all three intervals. The LME model revealed a main task effect for the 8 s interval (F(1,405) = 7.15; p = 0.008), reflecting more negative HEP2 amplitudes for the duration-encoding phase than for the reaction-time task.
Development of the HEP components during the encoding phase compared with the reaction-time task. Second-by-second HEP1 (left column) and HEP2 (right column) amplitudes for (a) 4 s interval, (b) 8 s interval, and (c) 12 s interval during the reaction-time task and the duration-encoding phase. All significant differences are demonstrated with black and red lines. The red lines show that the corresponding difference was only significant in the encoding phase of the duration-reproduction task.
A main effect of time was observed (F(11,621) = 1.98; p = 0.027) for the 12 s intervals, showing higher HEP2 values for the third, fifth, sixth, and seventh seconds compared with the first second, but only for the encoding phase of the timing task (β = −0.68, z = −3.3, p = 0.004; β = −0.68, z = −3.31, p = 0.004; β = −0.60, z = −2.9, p = 0.007; β = −0.79, z = −3.86, p < 0.001, respectively). The slope of this gradual development for HEP1 and HEP2 showed no significant association with the reproduced durations. Similarly, the rate of change [difference between the first and last second HEP1 (HEP1diff) or HEP2 (HEP2diff)] showed no correlation with the reproduced durations.
The cumulative HEP2 trend for each interval and each task was examined (Fig. 8) to better understand the gradual change of the HEP2 component within each interval. Starting from the first second of the interval, an apparent ramp-like increase in the cumulative HEP2 amplitude could be observed during the 8 and 12 s intervals for the encoding phase of the timing task, but not for the reaction-time task. We compared the cumulative HEP2 values between the two tasks separately for each interval using the LME models with task (reaction time or duration encoding) and time (first second up to last second of the interval) as fixed-effect factors. The LME model showed significant effects of the cumulative HEP2 for the 8 and 12 s intervals. The cumulative HEP2 amplitude was significantly more negative for the encoding phase of the duration-reproduction task compared with the reaction-time task (F(1,405) = 50.67; p < 0.001) for the whole 8 s interval. No significant effect of time was observed for the 8 s interval, indicating that the HEP2 amplitude does not differ as a function of time, although a trend is visible (Fig. 8b).
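One simple way to build such a cumulative trend is a running sum of the second-by-second amplitudes. The sketch below assumes that "cumulative" means a running sum over the seconds of the interval, which is not spelled out in this section; the function name and the amplitude values are hypothetical.

```python
from itertools import accumulate

def cumulative_trend(per_second_amplitudes):
    """Running sum of second-by-second amplitudes across an interval."""
    return list(accumulate(per_second_amplitudes))

# Hypothetical mean HEP2 amplitudes (microvolts) for the 8 s of one condition.
hep2_per_second = [-0.5, -0.25, -0.75, -0.5, -0.25, -0.5, -0.75, -0.5]
hep2_cumulative = cumulative_trend(hep2_per_second)
print(hep2_cumulative)
```

Accumulating in this way makes a small but consistent offset between tasks grow with each second, which may explain why a task difference can be pronounced in the cumulative measure even when the per-second values show no reliable change over time.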
The cumulative HEP2 trend. The cumulative HEP2 amplitudes for (a) 4 s interval, (b) 8 s interval, and (c) 12 s interval during the encoding phase of the duration-reproduction and the reaction-time task. The solid lines show significant effects for the 8 and 12 s intervals.
Considering the cumulative HEP2 values within the 12 s interval, the main effect of time was significant (F(11,621) = 3.38; p < 0.001). Post hoc tests showed that the cumulative HEP2 values decreased/changed significantly from the third second up to the end of the interval for the encoding phase of the reproduction task (all p < 0.005).
Gradual dynamics of the HEP components: duration-reproduction phase
We finally looked at the second-by-second development of the HEP components for the duration-reproduction phase. A similar gradual increase in the HEP2 was observed for all three intervals (Fig. 9). The LME model did not show a significant main effect of time for any of the three intervals. Correlation analysis showed a significant association between the amount of change in HEP2 in the reproduction phase and reproduced durations for the 8 and 12 s intervals (r(26) = −0.38, p = 0.04; r(26) = −0.55, p = 0.002). The larger the HEP2 differences between the first and last seconds of the intervals (HEP2diff), the shorter the reproduced durations (Fig. 9d).
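The HEP2diff measure and its correlation with reproduced durations can be sketched as below. The sign convention (last second minus first second) is an assumption, and the function names and all data values are hypothetical rather than taken from the study. The t statistic is included to show where the degrees of freedom in the r(26) notation come from (number of participants minus 2).

```python
import math

def hep2_diff(per_second):
    """Change in HEP2 from the first to the last second (assumed: last - first)."""
    return per_second[-1] - per_second[0]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic for a correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

# Hypothetical per-participant HEP2 series (microvolts) and reproductions (s).
hep2_series = [
    [-1.0, -0.8, -0.5],
    [-1.2, -0.6, -0.2],
    [-1.5, -0.7, 0.0],
    [-1.8, -0.9, 0.2],
    [-2.0, -1.0, 0.5],
]
reproduced = [7.4, 7.1, 6.2, 6.3, 5.5]

diffs = [hep2_diff(series) for series in hep2_series]
r_value = pearson_r(diffs, reproduced)
t_value = t_from_r(r_value, len(diffs))
print(f"r({len(diffs) - 2}) = {r_value:.2f}, t = {t_value:.2f}")
```

In this toy dataset, larger first-to-last increases go with shorter reproductions, mirroring the direction of the association reported above.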
Development of the HEP components during the reproduction phase. Second-by-second development of the HEP1 and HEP2 amplitudes for the duration-reproduction phase during the (a) 4 s, (b) 8 s, and (c) 12 s intervals. d, The average change in the HEP2 amplitude between the first and last seconds during the reproduction phase correlated negatively with the reproduced duration in the 8 and 12 s intervals.
Neural indices predict duration reproduction
We performed a stepwise regression analysis to identify the significant neural indices that predict temporal reproductions in each target duration. We first fitted the model for each target interval with one independent variable (SAQ scores) and added other variables (lCNV amplitudes, CNV slope, HEP1diff, and HEP2diff) as possible predictors.
The significance level for entry into the model was set at p < 0.05. Table 1 shows the final stepwise regression models for each interval-reproduction category along with the corresponding coefficients and associated statistics for the variables. The final model for the 4 s-interval reproductions only included the SAQ scores and was statistically significant (F(26) = 5.21; p = 0.030), explaining 16.7% of the variance in the reproduced durations. The 8 s-interval reproductions (the final model, which included the HEP2diff and SAQ scores as predictors) explained 43% of the variance in the reproduced durations (F(25) = 9.43; p < 0.001). Considering 12 s intervals, the final model included the lCNV amplitude, HEP2diff, and SAQ scores as predictors and explained 54.2% of the variance in the reproduced durations (F(24) = 9.46; p < 0.001). The results of the stepwise regression analysis suggest that HEP2diff and SAQ scores were associated with the reproduced durations in the 8 and 12 s intervals.
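The reported F statistics can be cross-checked against the reported proportions of explained variance using the standard relation F = (R²/k) / ((1 − R²)/(n − k − 1)), where k is the number of predictors in the final model. The sketch below assumes n = 28 participants, as implied by the degrees of freedom of the correlations and t tests reported above; the function name is hypothetical.

```python
def f_from_r2(r2, n_obs, n_predictors):
    """Overall regression F statistic and residual df from R-squared."""
    df_resid = n_obs - n_predictors - 1
    f_stat = (r2 / n_predictors) / ((1.0 - r2) / df_resid)
    return f_stat, df_resid

N_PARTICIPANTS = 28  # assumption inferred from r(26) and t(27) in the text

# (interval label, reported R-squared, number of predictors in the final model)
final_models = [("4 s", 0.167, 1), ("8 s", 0.43, 2), ("12 s", 0.542, 3)]
f_results = {}
for label, r2, k in final_models:
    f_results[label] = f_from_r2(r2, N_PARTICIPANTS, k)
    f_stat, df_resid = f_results[label]
    print(f"{label}: F({k},{df_resid}) = {f_stat:.2f}")
```

The recomputed values agree with the reported statistics to within rounding of the R² values, and the residual degrees of freedom (26, 25, and 24) correspond to the single values given in parentheses in the text.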
Results of the stepwise regression analysis
Discussion
This study aimed to assess the association between time perception and interoceptive processes. To address this, we asked participants to perform a duration-reproduction task and a reaction-time task with 4, 8, and 12 s intervals. We evaluated participants’ responses to interoceptive cues through the HCT with adapted instructions and with the interoceptive Self-Awareness Questionnaire (SAQ). Additionally, we analyzed HEPs to further investigate the neural underpinnings of this association. Our behavioral findings confirmed previous studies (Eisler, 1976; Noulhiane et al., 2009; Wittmann et al., 2010) by showing that participants progressively under-reproduced intervals in relation to the target-interval length. A significant drop in reproduction accuracy was observed between the 4 s and the 12 s intervals and between the 8 s and the 12 s intervals. The longer the duration, the harder it was to retain its representation in working memory and reproduce it accurately.
The interoceptive awareness score (assessed by the SAQ) correlated positively with the mean reproduced durations for all intervals. Similar significant correlations were also observed between the SAQ and the absolute accuracy of reproductions of the 12 s interval. The higher the SAQ score (a stronger awareness of bodily signals), the longer the duration reproductions and the more accurate the timing behavior. This association aligns well with the previous work showing that interoceptive awareness predicts timing accuracy in an irregular, but not in a regular, externally cued condition (Teghil et al., 2020a). As no regularly occurring external cues were present in this experiment (and no counting or rhythm-based strategy was used, as evidenced by the accurate performance of the secondary working-memory task), participants had to rely on interoception of body signals for timing intervals. However, this study failed to find any association between the HCT performance and the reproduced durations across all intervals, which stands in contrast to previous positive findings (Meissner and Wittmann, 2011; Richter and Ibáñez, 2021). Regarding the elaborate critique of Desmedt et al. (2018, 2020) concerning this type of heartbeat-detection task, other types of tasks could be used in the future. For example, in one task, participants are required to tap a key in synchrony with their heartbeats, which assesses heartbeat-detection ability more objectively (Garfinkel et al., 2022). This stands in accordance with the criticism of Desmedt et al. (2018, 2020), who through systematic variation were able to show that with the original HCT individuals did not actually rely so much on interoceptive (heartbeat) cues but on time- and knowledge-based estimations.
These outcomes relating to a partial association between interoception and subjective time are complemented by the results derived from the HEP analysis. The HEP is an ERP component of the heartbeat that covaries with bodily self-consciousness (Montoya et al., 1993; Park et al., 2016). Specifically, the HEP amplitude in the 200–600 ms window after the R-peak is an index of cortical processing of afferent cardiac signals. We found a duration-specific modulation of this HEP signal only during the encoding phase of the duration-reproduction task. The HEP amplitude within interval borders [130–270 ms] and [470–520 ms] after the R-peak was smaller for the 4 s interval than the 8 and 12 s intervals. We named the average HEP within the first and second time windows HEP1 and HEP2, respectively. Given that no differences between intervals were observed in the reaction-time task, these results highlight a duration-specific function of the HEP signal. The HEP amplitude during these time windows increased with increasing time intervals in the encoding phase, which corresponds well with the results of the study by Richter and Ibáñez (2021). They demonstrated that the HEP amplitude was significantly higher within the borders of 182–222 ms after the R-peak in those participants who overestimated duration compared with the under-estimator subgroup. We observed a negative correlation between the HEP2 value during the reproduction phase of the timing task and the reproduced durations for the 8 s intervals. The larger the HEP2 amplitude, the shorter the reproduced interval.
The novelty of our study lies in our assessment of the gradual development of the HEP components in each task. The gradual development of the HEP2 amplitude presented in Figures 7 and 8 (cumulative as well as second-by-second development) resembles the previously reported ramp-like increases in cardiac interbeat intervals (Meissner and Wittmann, 2011) or the climbing activity in the insular cortex as detected in fMRI (Wittmann et al., 2010) during the encoding and the reproduction phases of the timing task. This ramp-like increase was significant for the duration-encoding phase of the timing task and not for the reaction-time task, particularly for 12 s intervals. While participants were trying to estimate the 12 s interval, the HEP2 amplitude (especially the cumulative HEP2) gradually increased, showing a higher HEP2 after the third second up to the end of the interval compared with the first 3 s. These observations are consistent with the notion of the cortical accumulation of heartbeats as the neural mechanism of suprasecond timing (Wittmann, 2013). We also observed indications of a reset of the climbing HEP activity as a function of timing, which was most clearly visible in the 8 s-encoding condition (Fig. 9b). Climbing HEP amplitudes decreased just before the expected end of the stimulus, i.e., between 3 and 4 s, and then again between 7 and 8 s. No further climbing activity was observed for the encoding period between 8 and 12 s, perhaps because the brain could predict the longest interval's duration immediately once the sound did not stop at 8 s. These preliminary findings and speculative interpretations should be studied further in follow-up studies.
Notably, the amount of HEP2 increase within the reproduction phase of the duration-reproduction task significantly correlated with the reproduced durations for the 8 and 12 s intervals. The larger the increase in the HEP2 amplitude, the faster the subjective time and the greater the under-reproduction of duration. This is a striking correlation between the HEP2 as a measure of cortical response to heartbeat and behavioral timing.
Regarding the CNV amplitude, which has been suggested to represent a time-based response in the decision-making process (Pfeuty et al., 2005; Casini and Vidal, 2011; van Rijn et al., 2014; Kononowicz et al., 2015), the timing task (the encoding phase and reproduction phase) led to significantly more negative CNV values (as tested by the cluster-based permutation test) compared with the control task. The average iCNV and lCNV components were significantly larger (more negative) for the encoding phase of the duration-reproduction task than for the reaction-time task. Both components (the iCNV and lCNV) have been linked to orienting attention and early anticipation processes (Kononowicz et al., 2015; Robinson and Wiener, 2021). Given the link between these CNV components and attention modulation, we interpret these findings as a sign of stronger sustained attention toward time during the timing task. While no significant differences in the iCNV component were found for different durations, the lCNV amplitude during the encoding phase covaried with the duration of the interval, reflecting the previously reported larger lCNV amplitudes for shorter durations (Kononowicz et al., 2015; Robinson and Wiener, 2021). A similar covariation was observed for the slope of the CNV during both encoding and reproduction phases, with shorter durations associated with larger CNV slopes. The CNV slope was significantly higher for the encoding phase of the duration-reproduction task than for the reaction-time task, especially during 8 s intervals.
Building on the aforementioned conceptual and empirical work, which the present results of our study complement, one can conclude that the passage of time is constituted through the existence of the bodily self across time as an enduring and embodied entity (Craig, 2009b), an idea that was voiced by Merleau-Ponty (1945) in his phenomenological analysis: the physical self and subjective time are inseparable. The cognitive pacemaker-accumulator model with its abstract “pulses” can now be complemented with concrete entities: the neural signals from the body as subjectively experienced (as assessed with the Self-Awareness Questionnaire) and objectively measured by the HEP recorded 470–520 ms after the R-peak. Accordingly, the two modulators of subjective time (attention and arousal) in the pacemaker-accumulator model can be understood as regulating the inflow of signals from the body, potentially as they accumulate over time in the insular cortex. A subjective expansion of duration is achieved through an increased awareness of bodily states or an increase in the amplitude of the HEP, either through more attention to the bodily self or through increased physical arousal.
The present study has some limitations that should be acknowledged. First, we restricted trial numbers to 126 (42 per target duration) for practicality, which potentially limited the statistical power of our analysis, particularly for the analysis of HEP dynamics across time. Next, we employed a secondary working-memory task to prevent the confounding effects of using a counting or a rhythm-based strategy for time estimation. Adding an ecologically valid task without this constraint might have been beneficial. As discussed above, employing adapted instructions and confidence ratings in the HCT may reduce participants’ reliance on time- and knowledge-based estimation but cannot completely control for it. Future studies should therefore use a more comprehensive approach to assess interoceptive accuracy, incorporating measures beyond the HCT, such as behavioral tests tailored to capture distinct aspects of interoception without relying on subjective time- and knowledge-based estimation strategies (Desmedt et al., 2023). This work explored the relationship among CNV, HEP, and temporal reproductions within the suprasecond range (4, 8, and 12 s). Future studies should consider a broader range of intervals to ascertain the HEP's relevance across different time scales. Longer intervals may invoke distinct cognitive strategies beyond the HEP (e.g., memory load) or slower interoceptive signals, such as the breathing rate.
Timing mechanisms in the brain are essential for an organism to represent environmental temporal regularities and the temporal metrics of events (Buhusi and Meck, 2005). Based on conceptual considerations from functional neuroanatomy and from experimental psychophysiology, we suggest that interoceptive (bodily) states create the experience of time. We provide, for the first time, electrophysiological evidence that the brain processes signal components from the ascending heartbeats while tracking time. This finding points to a neurophysiological mechanism that may underlie the estimation of duration. Within the theoretical framework of the embodiment of time, we presented evidence of how the heart and associated brain networks process time. We postulate that the ongoing creation of an embodied self over time by ascending neural signals from the body, forming the material “me” (Craig, 2009b), could function as a measure of time.
Data Availability
The complete data supporting this study's findings are available for researchers from the first author upon reasonable request. The example dataset and the code used for the analysis will be made available on the Open Science Framework website (www.osf.io).
Footnotes
This study was funded by the EU, Horizon 2020 Framework Program, FET Proactive (VIRTUALTIMES consortium, grant agreement Id: 824128 to M.W.). VIRTUALTIMES—exploring and modifying the sense of time in virtual environments—includes the following groups with the principal investigators Kai Vogeley (Cologne), Marc Wittmann (Freiburg), Anne Giersch (Strasbourg), Marc Erich Latoschik, Jean-Luc Lugrin (Würzburg), Giulio Jacucci, Niklas Ravaja (Helsinki), Xavier Palomer, and Xavier Oromi (Barcelona).
The authors declare no competing financial interests.
- Correspondence should be addressed to Shiva Khoshnoud at shiva.khoshnoud{at}uni-tuebingen.de.