Estimating the errors on measured entropy and mutual information☆
Introduction
Information theoretic functionals such as entropy and the related quantity of mutual information can be used to identify general relationships between variables. Information entropy has been used to analyze the behavior of nonlinear dynamical systems and time series [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. Information entropy is also used to quantify the complexity of symbol sequences such as DNA sequences [11]. Invariably the analysis of real data involves a finite amount of data, and in the case of continuous variables a quantization must also be chosen. The calculated entropy of the data will therefore depend on the amount of data and on the quantization chosen. To assess the significance of a calculated entropy, the effect of finite data and quantization on the probability distribution of the calculated entropy should be known. Expressions for the systematic and random errors in observed entropies have been derived before by Basharin [12], Harris [13] and Herzel et al. [14], but such expressions have rarely been used in the nonlinear dynamics literature.
In this paper error estimates will be derived using the standard error formulae familiar to physicists. The formulae are presented in concise form in Section 6.
These formulae will be verified by numerical experiments before being applied to the well-known logistic equation. This will demonstrate that when small datasets are analyzed, the bias and random error in mutual information can be significant and should be estimated.
Entropy and mutual information
The most common information theoretic functional is entropy. For a discrete variable, X, the entropy is defined as
$$H(X) = -\sum_{i=1}^{B_X} p_i \ln p_i,$$
where the sum is over the B_X "states" that X can assume and p_i is the probability that X will be in state i. The joint entropy of two discrete variables, X and Y, is defined as
$$H(X,Y) = -\sum_{i=1}^{B_X} \sum_{j=1}^{B_Y} p_{ij} \ln p_{ij},$$
where the sum is over the B_X states that X can assume and the B_Y states that Y can assume, and p_{ij} is the probability that X is in state i and Y is in state j.
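As a concrete illustration (this code is not from the paper), a minimal Python sketch of these plug-in estimates, using natural logarithms so that entropies are in nats; the function names are my own:

```python
import numpy as np

def entropy(counts):
    """Plug-in entropy, in nats, from a vector of bin counts n_i."""
    counts = np.asarray(counts, dtype=float)
    q = counts / counts.sum()
    q = q[q > 0]                      # empty bins contribute nothing to the sum
    return -np.sum(q * np.log(q))

def joint_entropy(table):
    """Plug-in joint entropy H(X, Y) from a B_X-by-B_Y table of pair counts."""
    return entropy(np.asarray(table).ravel())
```

With these, the mutual information discussed below is entropy(n_x) + entropy(n_y) - joint_entropy(n_xy).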
Estimating the error on an observed entropy
In this section the systematic and random errors of the observed entropy of a series of values will be estimated.
Consider an ensemble of series. Let there be N values in each series. Let each value be assigned to one of B states, which will be labeled i (i = 1, 2, …, B). Let the probability that a value will be in the ith state be p_i. Let the number of values in the ith state be n_i. The number of values in the ith state, n_i, is a binomial random variable. This can be seen by considering each member of the series as an independent trial that falls in the ith state with probability p_i, so that n_i is the number of successes in N such trials.
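To make the ensemble picture concrete, here is a small Monte Carlo sketch (my construction; the non-uniform p_i is an arbitrary choice for illustration) that draws multinomial counts, so each n_i is marginally binomial, and compares the sample bias and spread of the observed entropy with the first-order expressions quoted in the Summary:

```python
import numpy as np

# Ensemble of series: N values over B states, repeated over many trials.
rng = np.random.default_rng(0)
B, N, trials = 10, 200, 20000

p = np.arange(1.0, B + 1)
p /= p.sum()                             # assumed "true" distribution p_i
H_true = -np.sum(p * np.log(p))          # true entropy, in nats

H_obs = np.empty(trials)
for t in range(trials):
    n = rng.multinomial(N, p)            # each n_i is marginally binomial(N, p_i)
    q = n[n > 0] / N
    H_obs[t] = -np.sum(q * np.log(q))

print("sample bias    :", H_obs.mean() - H_true)
print("predicted bias :", -(B - 1) / (2 * N))
print("sample std     :", H_obs.std())
print("predicted std  :", np.sqrt((np.sum(p * np.log(p) ** 2) - H_true**2) / N))
```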
Estimating the error on an observed mutual information
The error analysis of an observed mutual information can be performed in a similar manner to that of an observed entropy.
The observed mutual information, I_obs, is given by
$$I_{\rm obs} = H_{\rm obs}(X) + H_{\rm obs}(Y) - H_{\rm obs}(X,Y).$$
If X can assume one of B_X states and Y can assume one of B_Y states, and if there are N pairs (X, Y), then the expectation value of I_obs is given by
$$\langle I_{\rm obs} \rangle = I_\infty + \frac{(B_X - 1)(B_Y - 1)}{2N},$$
where I_∞ is the "true" mutual information which would be measured when N → ∞.
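A sketch of how these estimates might be computed in practice (my own helper, in nats; the bias is written with the occupied-bin counts B* used in the Summary, which reduces to the expression above when every bin is occupied):

```python
import numpy as np

def mutual_information_with_errors(n_xy):
    """Plug-in mutual information (nats) from a B_X-by-B_Y table of pair
    counts, with first-order bias and standard-error estimates."""
    n_xy = np.asarray(n_xy, dtype=float)
    N = n_xy.sum()
    q_xy = n_xy / N                      # observed joint distribution
    q_x = q_xy.sum(axis=1)               # observed marginal of X
    q_y = q_xy.sum(axis=0)               # observed marginal of Y

    m = q_xy > 0
    log_ratio = np.log(q_xy[m] / np.outer(q_x, q_y)[m])
    I_obs = np.sum(q_xy[m] * log_ratio)

    # systematic error <I_obs> - I_inf, from the numbers of occupied bins
    bias = (m.sum() - (q_x > 0).sum() - (q_y > 0).sum() + 1) / (2 * N)

    # first-order (propagation-of-errors) standard error
    err = np.sqrt(np.sum(q_xy[m] * (log_ratio - I_obs) ** 2) / N)
    return I_obs, bias, err
```

A bias-corrected estimate is then I_obs - bias, quoted with the standard error err.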
Application to the logistic equation
To illustrate how the error estimates can be used, they were applied to datasets generated using the famous logistic equation
$$x_{n+1} = 4x_n(1 - x_n).$$
Time series of N=5000, N=500 and N=200 data points were generated. The points lay on the real interval [0,1] but were binned into 10 bins, each of width 0.1. The mutual information of each time series and a lagged version of itself was then calculated. The results are shown in Fig. 3. In each panel the solid line shows the observed mutual information as a function of the lag.
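A self-contained sketch of this experiment (the map parameter 4 and the seed x_0 = 0.3 are assumptions on my part, and entropies are in nats):

```python
import numpy as np

# Logistic-map experiment: N points binned into 10 bins of width 0.1,
# mutual information between the series and a lagged copy of itself.
def lagged_mi(x, lag, bins=10):
    n_xy, _, _ = np.histogram2d(x[:-lag], x[lag:], bins=bins,
                                range=[[0.0, 1.0], [0.0, 1.0]])
    N = n_xy.sum()
    q = n_xy / N
    q_x, q_y = q.sum(axis=1), q.sum(axis=0)
    m = q > 0
    log_ratio = np.log(q[m] / np.outer(q_x, q_y)[m])
    I = np.sum(q[m] * log_ratio)
    bias = (m.sum() - (q_x > 0).sum() - (q_y > 0).sum() + 1) / (2 * N)
    err = np.sqrt(np.sum(q[m] * (log_ratio - I) ** 2) / N)
    return I - bias, err                 # bias-corrected estimate, standard error

x = np.empty(500)                        # the N = 500 case
x[0] = 0.3
for n in range(len(x) - 1):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])

for lag in (1, 2, 3, 4, 5):
    I, err = lagged_mi(x, lag)
    print(f"lag {lag}: I = {I:.3f} +/- {err:.3f} nats")
```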
Summary
Estimates of the systematic and standard error on observed entropies and mutual informations have been derived. The result for entropy is
$$\langle H_{\rm obs} \rangle = H_\infty - \frac{B^* - 1}{2N}, \qquad \sigma^2_{H_{\rm obs}} = \frac{1}{N} \sum_{i=1}^{B^*} q_i \left( \ln q_i + H_{\rm obs} \right)^2,$$
where q_i is the observed distribution of states and B^* is the number of bins for which q_i ≠ 0. The result for mutual information is
$$\langle I_{\rm obs} \rangle = I_\infty + \frac{B^*_{XY} - B^*_X - B^*_Y + 1}{2N}, \qquad \sigma^2_{I_{\rm obs}} = \frac{1}{N} \sum_{i,j} q_{ij} \left( \ln \frac{q_{ij}}{q_{X,i}\, q_{Y,j}} - I_{\rm obs} \right)^2,$$
where q_X and q_Y are the observed distributions of X and Y respectively, that is q_{X,i} = Σ_j q_{ij} and q_{Y,j} = Σ_i q_{ij}, and B^*_X, B^*_Y and B^*_{XY} are the numbers of occupied bins in q_X, q_Y and q_{XY} respectively.
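For reference, a compact sketch of the entropy result (natural logarithms; the helper name is mine, not the paper's):

```python
import numpy as np

def entropy_with_errors(counts):
    """Observed entropy (nats) with the corrections summarized above:
    <H_obs> = H_inf - (B* - 1)/(2N) and
    sigma^2 = (1/N) * sum_i q_i * (ln q_i + H_obs)^2."""
    counts = np.asarray(counts, dtype=float)
    N = counts.sum()
    q = counts[counts > 0] / N           # observed distribution; len(q) = B*
    H_obs = -np.sum(q * np.log(q))
    H_corrected = H_obs + (len(q) - 1) / (2 * N)   # remove the systematic error
    err = np.sqrt(np.sum(q * (np.log(q) + H_obs) ** 2) / N)
    return H_corrected, err
```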
Acknowledgements
The author would like to thank Hans-Peter Herzel for drawing his attention to some of the previous work in this field, and the two anonymous reviewers, whose suggestions greatly improved this paper.
References (18)
Singular-value decomposition in attractor reconstruction – pitfalls and precautions, Physica D (1992)
Information theoretic test for nonlinearity in time-series, Phys. Lett. A (1993)
Testing for nonlinearity in weather records, Phys. Lett. A (1994)
Testing for nonlinearity using redundancies – quantitative and qualitative aspects, Physica D (1995)
Detecting nonlinearity in multivariate time-series, Phys. Lett. A (1996)
Coarse-grained entropy rates for characterization of complex time-series, Physica D (1996)
Extraction of delay information from chaotic time series based on information entropy, Physica D (1997)
Significance testing of information theoretic functionals, Physica D (1997)
Finite sample effects in sequence analysis, Chaos, Solitons and Fractals (1994)
☆ Contribution number 5759, California Institute of Technology Division of Geological and Planetary Sciences.