The Journal of Neuroscience, September 15, 1998, 18(18):7474-7486
Low Doses of Ethanol Reduce Evidence for Nonlinear Structure in
Brain Activity
Cindy L. Ehlers1, James Havstad, Dean Prichard2, and James Theiler3
1 Department of Neuropharmacology, The Scripps Research
Institute, La Jolla, California 92037, 2 Center for
Adaptive Systems Applications, Los Alamos, New Mexico 87544, and
3 Los Alamos National Laboratory, Los Alamos, New
Mexico 87544
ABSTRACT
Recent theories of the effects of ethanol on the brain have
focused on its direct actions on neuronal membrane proteins. However, neuromolecular mechanisms whereby ethanol produces its CNS
effects in low doses typically used by social drinkers (e.g., 2-3
drinks, 10-25 mM, 0.05-0.125 gm/dl) remain less well
understood. We propose the hypothesis that ethanol may act by
introducing a level of randomness or "noise" in brain electrical
activity. We investigated the hypothesis by applying a battery of tests
originally developed for nonlinear time series analysis and chaos
theory to EEG data collected from 32 men who had participated in an
ethanol/placebo challenge protocol. Because nonlinearity is a
prerequisite for chaos and because we can detect nonlinearity more
reliably than chaos, we concentrated on a series of measures
that quantitated different aspects of nonlinearity. For each of these
measures the method of surrogate data was used to assess the
significance of evidence for nonlinear structure. Significant nonlinear
structure was found in the EEG as evidenced by the measures of time
asymmetry, determinism, and redundancy. In addition, the evidence for
nonlinear structure in the placebo condition was found to be
significantly greater than that for ethanol. Nonlinear measures, but
not spectral measures, were found to correlate with a subject's
overall feeling of intoxication. These findings are consistent with the
notion that ethanol may act by introducing a level of randomness in
neuronal processing as assessed by EEG nonlinear structure.
Key words:
EEG; ethanol; chaos; surrogate data; time series
analysis
INTRODUCTION
The neuromolecular basis of the
intoxicating effects of low doses of ethanol is poorly understood. The
lipid theory of the actions of alcohol is based on the fact that the
behavioral potency of aliphatic n-alcohols with up to five
carbon atoms is correlated with both their membrane lipid disordering
potency and their lipid solubility (membrane/buffer partition
coefficient) (Seeman, 1972 ; Trudell, 1977 ; Goldstein, 1984 ; Rall,
1990 ). Recent theories have shifted from a focus on lipids to include a
direct interaction with receptor proteins (Weight, 1992 ; Grant, 1994 ;
Li et al., 1994 ; Peoples et al., 1996 ). Although some effects of
ethanol on receptor proteins, such as NMDA, have been reported
at concentrations produced by pharmacological doses (50-100
mM, 0.25-0.5 gm/dl) (Peoples and Weight, 1995 ), the
effects have yet to be shown to be significant at concentrations produced by doses
typically used by social drinkers (e.g., 2-3 drinks, 10-25
mM, 0.05-0.125 gm/dl).
One explanation for the difficulty in demonstrating significant
effects of low doses of ethanol on the neuronal level may be that
ethanol affects both proteins and lipids; however, it may not produce a
clear agonist or antagonist effect but, rather, may introduce an
increased level of randomness in neuronal processing. Several studies
provide data that are descriptively supportive of this idea. For
instance, Aston-Jones et al. (1982) demonstrated that low doses of
ethanol, although having no effect on the mean spontaneous discharge of
rat locus coeruleus neurons, significantly increased the variability in
the latency at which those neurons fired in response to sensory
stimuli. In addition, at the level of the EEG it has been found in both
humans and animals that ethanol increases the variance of EEG and
event-related potential signal amplitudes over time (Ehlers and
Havstad, 1982 ; Ehlers and Reed, 1987 ; Ehlers, 1988 , 1992 ; Ehlers et
al., 1989 ).
To assess this hypothesis, we needed to quantitate the effects of
ethanol on brain and behavior, using measures that reflect aspects of
the dynamical behavior of a system. The present study used a series of
statistical measures derived from chaos theory, including measures of
time asymmetry, determinism, dimension, and redundancy, to evaluate EEG
data collected from 32 subjects participating in an ethanol/placebo
challenge protocol. The method of surrogate data, a recent fundamental
strategy in nonlinear dynamics, was used to assess the evidence for
nonlinearity in these data sets. Using this data set, we
explored the following six questions: (1) Is there evidence for
significant nonlinear structure in the EEG? (2) Do EEGs collected under
a placebo condition differ from those collected after the ingestion of
ethanol? (3) Are nonlinear statistics better than linear statistics for
distinguishing between EEGs collected under the placebo and ethanol
conditions? (4) What measures of nonlinear structure are better in
distinguishing an EEG from its surrogates or placebo from ethanol? (5)
Does ethanol produce increases or decreases in evidence for nonlinear
structure as compared with placebo? (6) Do linear or nonlinear measures of the actions of alcohol on the EEG correlate with a person's subjective report of intoxication?
MATERIALS AND METHODS
Subjects
The subjects who participated in this study were 18- to
25-year-old male students and nonacademic staff from the University of
California, San Diego. Data were available on men who participated in
one of two different ongoing clinical protocols. These subjects met
individually with research staff to complete a screening questionnaire (Schuckit and Gold, 1988 ) that was used to select individuals who met
the eligibility requirements for the study. Using this highly
structured self-report instrument, we excluded subjects from further
evaluation if they or their first degree relatives met diagnostic
criteria for alcohol or other substance dependence or other major Axis
I psychiatric disorders according to the Diagnostic and
Statistical Manual, Volume III. The screening questionnaire also
was used to gather information on demography, personal medical history,
usual quantity and frequency of alcohol consumption over the previous 6 months, and family history of alcohol and other substance dependence.
Individuals also were excluded from further study if they were taking
prescribed medication, had any major medical condition, or had
abstained from alcohol over the previous 6 months. The subject
selection process has been described in detail
previously (Schuckit, 1985 ; Ehlers and Schuckit, 1991 ; Wall et al.,
1992 , 1993 ). Some of these men had participated in other parts of
larger studies (Ehlers et al., 1989 ; Ehlers and Schuckit, 1990 , 1991 ;
Wall et al., 1992 , 1993 ). Men (n = 32) who met the
final inclusion criteria then were invited to participate individually
in two test sessions, ~1 week apart, that consisted of baseline
evaluations and subsequent challenges with placebo and alcohol.
On both test days each man arrived at the laboratory at approximately
8:00 A.M. after fasting overnight and was provided a standardized
low-fat breakfast. Baseline measurements were taken; at approximately
9:00 A.M. a placebo or alcohol beverage was administered in random
order, using a placebo alcohol administration device (Mendelson et al.,
1984 ). The alcohol beverage was 0.75 ml/kg of 95% alcohol as a
20%-by-volume solution in a caffeine-free and sugar-free soda. The
placebo beverage was made by using the same mixer with 3 ml of 95%
alcohol floated on top. Subjects were instructed to drink at a steady
pace and to consume the beverage over 7 min. EEG data were collected at
90 min after placebo and alcohol. Subjective feelings of intoxication
were measured by using a modified version of the Subjective High
Assessment Scale (SHAS), which consists of 13 items rated on Likert
scales ranging from 0 (normal) to 36 (extreme effect) (see Judd et al.,
1977 ). The total score on this scale was available for each subject and subsequently was used for statistical analyses. Blood samples also were
drawn from a heparinized lock inserted into an antecubital vein for
subsequent determination of blood alcohol concentrations (BACs), using
a modified alcohol dehydrogenase assay.
EEG data were collected from lead P4-O2 (right parietal cortex
referenced to right occipital cortex), as described previously, because
in several studies we have found that lead to be the most sensitive to
the effects of ethanol (see Ehlers et al., 1989 ; Wall et al., 1993 ).
Six minutes of EEG data were collected while the subject was relaxed
with eyes closed and with the filters set at 1-70 Hz. The technician
carefully monitored the subject for any signs of drowsiness. Three
minutes of continuous artifact-free nondrowsy EEG data were
computer-analyzed. EEGs were digitized to 12 bits of resolution at 256 samples per second.
All subjects signed informed consent, and the study was approved by the
Scripps and University of California, San Diego, internal review
boards.
Data analyses
Conceptual approach to the data analyses
Nonlinear structure in the EEG after placebo and ethanol
administration was quantified by computing a series of measures on the
EEG. Most nonlinear measures are sensitive to structure in the data but
do not discriminate explicitly between linear and nonlinear structure.
Thus to quantify nonlinear structure in the EEG, we constructed a
"surrogate" time series from the original EEG signal; the surrogate
data mimic the linear structure in the original data, but are otherwise
random. Thus, all but the linear structure is effectively removed. In
comparing data sets with their associated surrogates, we can identify
which have more significant evidence for nonlinear structure. Although
we can choose specific measures (e.g., estimated correlation dimension,
or in-sample root-mean-square error of a nearest-neighbor prediction
algorithm) for quantifying the overall structure in the data, it is
difficult to make general quantitative statements about this structure. When we speak informally about more or less "randomness" in the time series, we generally are referring to more or less significant evidence for nonlinearity in the data. The measures used to quantify nonlinear structure in the EEG included an estimation of correlation dimension, an inverse measure of trajectory density, a measure of
determinism, information redundancy, and time series slope asymmetry.
Linear aspects of the data also were quantified by using the power
spectrum and the autocorrelation function. A brief conceptual
description for these measures is provided below. In the following
section, we will revisit each of the measures with a more detailed
technical description of the methodology.
Time series slope asymmetry. Time series slope
asymmetry is one of the few nonlinear measures that the eye often can
detect in a data set such as the EEG, as seen in Figure
1. The concept of slope asymmetry is
derived from the fact that time series generated from linear processes
(such as sine waves) appear statistically the same whether the data are
viewed as running forward or backward in time. By contrast, time series
generated by nonlinear systems such as relaxation oscillators generally
have distinct rise and fall times. For such systems this measure
captures the difference between the rise times (upward part) and fall
times (downward part) of the oscillations in a dynamical system. One of
the simplest ways to measure slope asymmetry is the skewness (third
statistical moment) of slopes, as described below.

Figure 1.
Slope asymmetry in the EEG. Although the slope
asymmetry is computed as a single average of slope skewness (see
Materials and Methods for equation), this figure illustrates how the
slope skewness can vary over the time series. Two 8 sec segments of EEG
are shown in a and c, together with the
slope skewness as a moving average over the preceding 0.5 sec. In
b, EEG and corresponding slope skewness of two brief
sections from a are shown with an expanded time scale,
showing episodes of positive slope asymmetry in each of which a single
high-amplitude EEG wave with very rapid rise, as compared with the
preceding and following fall, produces an abrupt positive shift in
slope skewness. A different pattern is seen in c, where
positive skewness is more continuous, resulting from a steeper rise
than fall over many EEG waves. These are representative of several
observed patterns, all of which produced predominantly positive
skewness of slope.
Time-delayed embedding. One of the most useful and important
tools in nonlinear time series analysis is the time-delayed embedding (Packard et al., 1980 ). A time series data set describes the
measurement of a single quantity (voltage on a certain electrode, for
the case of EEG) as a function of time, but the underlying system generally requires several distinct variables to fully describe its
state at a given instant in time; this state can be represented as a
(vector-valued) point in an abstract multidimensional "state space," and the position in this state space evolves with time to
trace out a "trajectory." The time-delayed embedding converts the
single-quantity measurements into vector-valued quantities for which
the components are time-delayed values of the scalar time series. The
important property of this embedding is that it preserves the geometric
properties of this trajectory (such as its fractal dimension) and thus
provides information about the dynamical properties of the system (for
a rigorous mathematical justification, see Takens, 1981 ; Sauer et al.,
1991 ). Many measures in this study (dimension, 1% radius, redundancy,
Kaplan's δ/ε) use embedded trajectories as a step toward the
quantification of the behavior of nonlinear systems.
Correlation dimension. One behavior characteristic of
deterministic dynamical systems is that they do not wander uniformly about the entire dynamical space but, instead, settle down (or "self-organize") into a small subspace of the full available
dynamical space within which the system is evolving. This subspace is
the "attractor" of the dynamical system. For chaotic systems the
attractor is sometimes fractal or "strange," but for deterministic
systems it is of lower dimension than the full-state space. This lower dimension indicates the number of active degrees of freedom or, approximately, the number of variables required to model the dynamics on the attractor. The correlation dimension (Grassberger and Procaccia, 1983 ) is one of the most popular ways to estimate the attractor dimension directly from time delay embedded data. Stochastic systems do
not settle onto low-dimensional attractors, so estimating dimension is
a direct way of distinguishing stochastic from deterministic systems.
However, although dimension estimation is appealing conceptually, it is
in practice a method fraught with peril (Theiler, 1990 ; Rapp,
1993 ).
The correlation integral is the first step in the estimation of a
correlation dimension. Although the dimension is the more physically
meaningful quantity, the correlation integral itself provides a more
direct and reliable comparative measure. It is a positive measure,
always less than or equal to one, which counts the fraction of pairs of
points for which the distance between the two members is less than a
certain value.
One percent radius. Instead of counting pairs of points that
are a specified distance apart, the 1% radius is the smallest distance
that is still larger than the smallest 1% of pairwise distances.
Because points generally are clustered more closely in the dynamical
space of nonrandom systems, their 1% radius is smaller than that of
random systems.
Redundancy. Redundancy measures how much duplication of
information occurs in a set of measurements. The process of
self-organization of a dynamical system into repeatable and distinct
forms produces redundancy. A purely random system that is wandering
through dynamical space only repeats its travels by chance. Subsequent
measurements for such a system always provide fresh information about
the state of the system. For nonrandom systems the measurements made in the past provide information that permits approximate and/or
probabilistic predictions of the future state of the system. For these
systems each measurement adds only partial information to what was
already predictable from the past; the measurements, collectively,
exhibit a measure of redundancy.
Kaplan's δ/ε. The most straightforward way to identify
determinism is to predict the future from information obtained from the
past and then to wait for the future and see how well the prediction
worked. For example, Scott and Schiff (1995) looked at the
predictability of interictal spikes in epileptic EEGs, and Schiff et
al. (1996) used mutual nonlinear prediction to characterize the
coupling in neural ensembles. This approach, however, is highly sensitive to the actual strategy used for making the predictions in the
first place. Thus measures based on prediction error were not used in
this study. However, Kaplan (1994) has described a measure that
produces a relatively direct measure of "determinism" without
invoking actual predictions. This measure is based on the idea that if
two points are close together on a trajectory, the images of the points
at some short time later are more likely to be close together if the
system is deterministic than if it is not.
Technical description of the data analyses
Segmentation of the time series. To reduce the
effects of possible nonstationarity of EEG time series, we divided each
3 min EEG record into 22 segments of 8 sec each, with nominal start times at intervals of 8 sec. This segment length was chosen as a compromise between the need for short segments, within which the EEG is likely to
be approximately stationary, and the need for long enough segments to
produce a larger sample size for improved statistics, particularly with
respect to adequately populating embedded system trajectories. The
number of samples in each segment (2048) is a power of two to simplify
the implementation of the fast Fourier transform algorithm.
Fourier transforms of finite segments of the EEG, used in the formation
of surrogate time series, can introduce spurious high-frequency content. Because the transform treats the segment as one period of a
periodic time series (i.e., as if the time series were an infinite
number of repetitions of the finite segment), a discontinuity between
the ends of the segment is treated as an instantaneous "jump" in
the data, which contributes high-frequency content to the Fourier power
spectrum. This high frequency appears in the surrogate data sets not as
a single discontinuity but as an overall crinkliness that is not
present in the original EEG. To minimize this effect, we adjusted the
starting point of each segment so that the first few samples of the
segment most closely match the first few samples after the segment in
the original EEG time series. The discontinuity
di was calculated for each point in the two seconds (512 sample times) starting with the nominal start point of an
8 sec segment:
|
(1)
|
where i is the index of the initial sample of the
time series xi relative to the nominal start
point, and n is 2048, the number of samples in an 8 sec time
series. The start point i is chosen to minimize this measure
di. Each 8 sec EEG segment was normalized to
zero mean and SD of one, with preservation of the scale factors so that
the original EEG amplitudes could be regenerated. Except for power
spectra, all subsequent processing used the normalized time series.
Each EEG 8 sec segment also was low-pass-filtered digitally at 45 Hz to
reduce muscle artifact while retaining structure related to the EEG.
Because this digital filtering is linear, it cannot introduce nonlinear
structure into data that are not already nonlinear.
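The segmentation and normalization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `split_segments` and `normalize_segment` are hypothetical, segments begin at their nominal start points only (the start-point adjustment of Equation 1 is omitted), and the 45 Hz low-pass filter is not included.

```python
import numpy as np

SEGMENT_LENGTH = 2048  # samples per 8 sec segment at 256 samples per second


def split_segments(record, n=SEGMENT_LENGTH):
    """Split a record into consecutive n-sample segments (nominal start points)."""
    record = np.asarray(record, dtype=float)
    n_segments = record.size // n
    return [record[k * n:(k + 1) * n] for k in range(n_segments)]


def normalize_segment(segment):
    """Return (z, mean, sd) with z of zero mean and unit SD.

    The scale factors are returned so that the original amplitudes can be
    regenerated as z * sd + mean.
    """
    x = np.asarray(segment, dtype=float)
    mean = x.mean()
    sd = x.std()
    if sd == 0:
        raise ValueError("a constant segment cannot be normalized")
    return (x - mean) / sd, mean, sd
```

A 3 min record at 256 samples per second holds 46,080 samples, so this split yields the 22 full segments mentioned in the text.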
Linear measures
Of the various measures calculated in this study, those that are
sensitive only to linear structure in time series are designated linear
measures. The linear content of a time series is described completely
by the power spectrum or, equivalently, by the autocorrelation function.
Power spectrum. The power spectrum, as opposed to
other derived measures in this study, is based on the full
un-normalized EEG signal and thus reflects differences in EEG amplitude
among segments and subjects. The total power in each power spectrum was
calculated by summing all components, and the powers in the θ- and
α-bands were calculated by summing components in bands from 4 to 8 Hz
and from 8 to 12 Hz, respectively. The θ- and α-bands were selected
because they have been demonstrated in previous studies (see Lukas et
al., 1986 ; Ehlers et al., 1989 ) to differentiate most clearly the EEG
after placebo from that after ethanol. The power in each band was
divided by the total power to provide the θ-fraction and
α-fraction. For the Fourier transforms used in the calculation of
power spectra and in the preparation of surrogate time series,
rectangular windowing was used.
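The band-power fractions can be illustrated with a short sketch. The following assumes a one-sided FFT power spectrum with no taper (rectangular window), half-open band edges (4 Hz included, 8 Hz excluded, and so on), and a total that includes every component; the name `band_fractions` and these edge conventions are assumptions for illustration, since the text does not specify them.

```python
import numpy as np

FS = 256.0  # samples per second, as stated in the text


def band_fractions(segment, fs=FS):
    """Return (total power, 4-8 Hz fraction, 8-12 Hz fraction) for one segment."""
    x = np.asarray(segment, dtype=float)
    # Rectangular windowing: the segment is transformed without any taper.
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    total = power.sum()
    low_band = power[(freqs >= 4.0) & (freqs < 8.0)].sum()    # 4 to 8 Hz
    high_band = power[(freqs >= 8.0) & (freqs < 12.0)].sum()  # 8 to 12 Hz
    return total, low_band / total, high_band / total
```

Because the fractions are ratios, they are insensitive to overall amplitude, whereas the total power is computed from the un-normalized signal and retains it.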
Autocorrelation function. The autocorrelation function of
each EEG and surrogate time series was calculated as the inverse Fourier transform of the power spectrum of the time series.
Autocorrelation time was calculated as the autocorrelation lag at which
the autocorrelation function first decreases to 1/e.
Coherence time is defined as the lag at which the envelope of the peaks
of the autocorrelation function decreases to 1/e (that is,
to ~37% of its initial amplitude), estimated by a least-squares fit
of a straight line to the logarithm of the magnitude of each positive
or negative peak for lags of not more than 500 msec.
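A minimal sketch of the autocorrelation calculation follows. It assumes a circular autocorrelation (consistent with treating the segment as one period of a periodic series), subtracts the segment mean before transforming, and reports the first lag at which the function is at or below 1/e; the function names are hypothetical, and the coherence-time fit is not shown.

```python
import numpy as np


def autocorrelation(x):
    """Circular autocorrelation via the inverse FFT of the power spectrum."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption: remove the mean before transforming
    power = np.abs(np.fft.fft(x)) ** 2
    acf = np.fft.ifft(power).real
    return acf / acf[0]  # normalize so that the value at lag zero is 1


def autocorrelation_time(acf, fs=256.0):
    """First lag (in seconds) at which the autocorrelation is <= 1/e."""
    below = np.nonzero(acf <= 1.0 / np.e)[0]
    if below.size == 0:
        return None
    return below[0] / fs
```

For a pure 10 Hz sinusoid sampled at 256 samples per second, this returns five sample periods (about 19.5 msec).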
Nonlinear measures
Surrogates. Surrogates were generated by the
amplitude-adjusted phase-randomized algorithm described in Theiler et
al. (1992) . The generation of surrogates uses the following steps: (1)
An "amplitude-adjusted" copy of the EEG is prepared by means of a static nonlinear transform so that the time series has a Gaussian distribution of amplitudes; (2) the Fourier transform is calculated; (3) the transform is converted to polar form, phase angles are replaced
by random numbers, and the result is converted back to complex form;
(4) the inverse Fourier transform is calculated; and (5) a static
nonlinear transform is used to convert to the original distribution of
amplitudes. The result is a surrogate time series that uses the
original EEG samples but in a different sequence, with the constraint
that the power spectrum and therefore the linear structure are
preserved, whereas nonlinear structure in a statistical sense is
removed. Figure 2 shows one such EEG segment and five of its surrogates.
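The five steps listed above can be sketched as follows. This is an illustrative reading of the amplitude-adjusted phase-randomized procedure, not the authors' implementation: both static nonlinear transforms are realized here by rank-order remapping, the DC and Nyquist components are left unchanged so that the inverse transform is real, and the name `aaft_surrogate` is hypothetical.

```python
import numpy as np


def aaft_surrogate(x, rng=None):
    """Return one amplitude-adjusted, phase-randomized surrogate of x."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size

    # (1) Amplitude-adjusted copy: Gaussian values in the rank order of x.
    ranks = np.argsort(np.argsort(x))
    gaussian = np.sort(rng.standard_normal(n))[ranks]

    # (2) Fourier transform of the Gaussianized series.
    spectrum = np.fft.rfft(gaussian)

    # (3) Keep the amplitudes, replace the phase angles by random numbers.
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    randomized = np.abs(spectrum) * np.exp(1j * phases)
    randomized[0] = spectrum[0]          # DC component left untouched
    if n % 2 == 0:
        randomized[-1] = spectrum[-1]    # Nyquist component left untouched

    # (4) Inverse Fourier transform.
    shuffled = np.fft.irfft(randomized, n=n)

    # (5) Map back to the original amplitudes: the surrogate consists of the
    #     original samples arranged in the rank order of the shuffled series.
    return np.sort(x)[np.argsort(np.argsort(shuffled))]
```

By construction the surrogate is a permutation of the original samples, which matches the statement that it uses the original EEG samples in a different sequence.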

Figure 2.
EEG and surrogate data. This figure displays a
sample 8 sec EEG epoch in the top trace and five
surrogate time series generated from it in the traces
below. Because the linear structure of the time series
is preserved in surrogates, the original time series and its surrogates
appear similar by visual inspection. This EEG is representative of the
posterior dominant rhythm of α-activity typical of eyes-closed human
EEG. Other patterns with more or less α-activity also were
observed.
Slope asymmetry. The asymmetry of the distribution of the
first time derivative of each EEG or surrogate time series is estimated by the skewness of differences between successive samples,
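$$S = \frac{\left\langle \left(\Delta_t - \bar{\Delta}\right)^3 \right\rangle}{\left\langle \left(\Delta_t - \bar{\Delta}\right)^2 \right\rangle^{3/2}}, \qquad \Delta_t = x_{t+1} - x_t \qquad (2)$$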
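As a hedged illustration, the skewness of successive differences can be computed in a few lines. The sketch assumes the standard (mean-centered) definition of skewness; the function name `slope_asymmetry` is hypothetical.

```python
import numpy as np


def slope_asymmetry(x):
    """Skewness of the differences between successive samples."""
    d = np.diff(np.asarray(x, dtype=float))
    d = d - d.mean()            # assumption: standard, mean-centered skewness
    second = np.mean(d ** 2)
    if second == 0:
        return 0.0              # constant slope: no asymmetry to measure
    return float(np.mean(d ** 3) / second ** 1.5)
```

A wave that rises quickly and falls slowly gives a positive value, its time-reversed copy gives the negative of that value, and a sinusoid gives a value near zero.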
Embedding. With the use of a standard algorithm (see
Packard et al., 1980 ) containing an embedding dimension m
and a delay time τ, the embedded vector is:
$$\mathbf{x}_i = \left(x_i,\; x_{i-\tau},\; x_{i-2\tau},\; \ldots,\; x_{i-(m-1)\tau}\right) \qquad (3)$$
For a deterministic system the embedding dimension should be
larger than the dimension of the attractor (Takens, 1981 ; Sauer et al.,
1991 ) to characterize the deterministic dynamics fully. However,
evidence for determinism is often available with a lower embedding
dimension, and in general the required/optimal embedding dimension is
rarely known a priori. As a consequence, this study uses several values
of embedding dimension. The embedding delay time τ for
this study is five sample periods (19.53 msec), which is approximately
the autocorrelation time of the EEG. Embedding dimensions m
are 1, 4, 8, 16, and 32.
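A time-delayed embedding of this kind can be sketched as below. The ordering of the components (current sample first, then progressively older samples) is an assumption made for illustration; the geometry of the trajectory does not depend on it. The name `delay_embed` is hypothetical, and the default delay of five samples follows the text.

```python
import numpy as np


def delay_embed(x, m, tau=5):
    """Return an array whose rows are m-dimensional delay vectors.

    Row r corresponds to time index i = r + (m - 1) * tau and holds
    (x[i], x[i - tau], ..., x[i - (m - 1) * tau]).
    """
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this embedding")
    columns = [
        x[(m - 1 - k) * tau:(m - 1 - k) * tau + n_vectors] for k in range(m)
    ]
    return np.column_stack(columns)
```

For a 2048-sample segment with m = 32 and a delay of five samples, this leaves 2048 - 155 = 1893 embedded points.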
Correlation integral. The order two correlation integral,
used in the calculation of several measures, is:
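$$C_2(m, r) = \frac{1}{N} \sum_{|i - j| > W} \Theta\left(r - \|\mathbf{x}_i - \mathbf{x}_j\|\right) \qquad (4)$$
where r is a distance, m is the
embedding dimension, $\|\mathbf{x}_i - \mathbf{x}_j\|$ is the distance between points
$\mathbf{x}_i$ and $\mathbf{x}_j$,
N is the number of pairs of points,
$\mathbf{x}_i$ and $\mathbf{x}_j$, used,
and Θ is the Heaviside function, which has the value 1 if the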
expression in the inner parentheses has a value greater than zero, and
zero otherwise. Summation of the Heaviside function in this equation
counts the number of point pairs separated by distance less than
r, and division by N produces the correlation
integral, the fraction of points separated by distance less than
r at embedding dimension m. To avoid an error caused by points closely spaced on the same orbit of the trajectory, the magnitude of the difference $|i - j|$ must
be greater than a fixed value W (Theiler, 1986 ) for which 50 sample periods were used. This value, corresponding to ~200 msec, is
greater than the (100 msec) period of the α-rhythm, which is the main
source of trajectory periodicity. The correlation integral at embedding
dimension m and for radius r is the fraction of
pairs of points not on the same loop of the trajectory for which the
separation is not greater than r.
The solid curves of Figure 3 are an
example of correlation integral as a function of radius, for various
values of embedding dimension, calculated for embedding dimension
m = 1, 4, 8, 16, and 32, for 128 values of radius from
1/64 to 2 SDs of the time series. Maximum (L∞)
norm is used for the calculation of distances.
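The correlation integral with an exclusion window can be sketched as follows. This is a straightforward, unoptimized illustration rather than the authors' code: the function name `correlation_integral` is hypothetical, distances use the maximum norm, pairs with index separation of W or less are skipped, and a pair is counted only when its distance is strictly less than r.

```python
import numpy as np


def correlation_integral(vectors, radii, window=50):
    """Fraction of point pairs (index separation > window) closer than each radius."""
    v = np.asarray(vectors, dtype=float)
    if v.ndim == 1:
        v = v[:, None]          # treat a scalar series as a 1-dimensional embedding
    radii = np.asarray(radii, dtype=float)
    n = v.shape[0]
    counts = np.zeros(radii.size)
    n_pairs = 0
    for lag in range(window + 1, n):
        # Maximum-norm distance between every point and the point `lag` later.
        dist = np.max(np.abs(v[lag:] - v[:-lag]), axis=1)
        # Number of distances strictly less than each radius.
        counts += np.searchsorted(np.sort(dist), radii, side="left")
        n_pairs += dist.size
    if n_pairs == 0:
        raise ValueError("series too short for the chosen window")
    return counts / n_pairs
```

Evaluating the function on a grid of radii for each embedding dimension gives curves of the kind plotted against log radius; the spacing of the 128 radius values is not stated in the text and would have to be chosen by the user.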

Figure 3.
Log of correlation integral
C2 as a function of log of radius
r, for various values of embedding dimension
m. The value of the correlation integral at radius
r is the average fraction of all points of the
trajectory lying within an m-dimensional cube of radius
r, with the cube centered on a point on the trajectory.
For ideal noise-free stationary low-dimensional systems, the log
correlation integral should decrease linearly with decreasing log of
radius at small radius; the slope of this scaling region is the
correlation dimension, estimated here by the slope of the dashed
straight lines fit to the curves. For curves at embedding
dimension sufficiently greater than the correlation dimension, the
slopes should saturate at a constant estimate of correlation dimension.
The curves shown here, for a single 8 sec segment of EEG, are
representative of many that show little evidence of a low-dimensional
attractor but that nevertheless have slopes less than those for
surrogates of the EEG.
The correlation integral also is used in the calculation of the
correlation dimension, 1% radius, Kaplan's δ/ε, and the order-two redundancy.
Correlation dimension. The correlation dimension is
calculated from the correlation integral (Grassberger and Procaccia,
1983 ). Although the correlation dimension
D2(m) is defined in terms of a limit
as radius goes to zero, it is estimated by the slope of the log of
correlation integral versus log radius, as shown in Figure 3. The
estimated slopes, and the range over which they are calculated, are
shown as dotted lines. Slope is estimated by a least-squares fit of a
straight line to points on the curve over a range of radius such that
the smallest value of radius is that for which the correlation integral
is greater than for the next smaller value of radius, and the largest
value of radius is the smallest value for which the correlation
integral is at least 1000 times its smallest value. Other methods to
estimate this slope have been suggested by Takens (1981) , Ellner
(1988) , and Theiler and Lookman (1993) .
One percent radius, r1%(m).
The radius at a particular value of correlation integral and
embedding dimension is the size of an m-dimensional cube
containing, on average, the fraction of points given by the correlation
integral and is therefore an inverse measure of the density of points
on the embedded trajectory. The measure
r(m,C2) for
m = 4, 8, 16, and 32, at C2 = 0.01, will be called the 1% radius at embedding dimension
m, r1%(m). The
trajectories of more obviously deterministic systems are likely to be
more dense, with smaller 1% radius, than those of systems in which the
complexity provides less evidence of determinism.
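One simple way to approximate the 1% radius is to take the first percentile of the pairwise distances directly, as in the sketch below. This is an assumption made for illustration: the text obtains the radius from the correlation integral curve, and the quantile shortcut, the function name `one_percent_radius`, and the reuse of the 50-sample exclusion window are not taken from the source.

```python
import numpy as np


def one_percent_radius(vectors, window=50, fraction=0.01):
    """Radius below which the given fraction of pairwise distances fall."""
    v = np.asarray(vectors, dtype=float)
    if v.ndim == 1:
        v = v[:, None]
    n = v.shape[0]
    if n <= window + 1:
        raise ValueError("series too short for the chosen window")
    # Maximum-norm distances for all pairs with index separation > window.
    dists = np.concatenate([
        np.max(np.abs(v[lag:] - v[:-lag]), axis=1)
        for lag in range(window + 1, n)
    ])
    return float(np.quantile(dists, fraction))
```

Because each segment is normalized to unit SD, values from different segments and their surrogates are on a common scale and can be compared directly.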
Redundancy. Information theoretic methods in nonlinear time
series analysis were advocated first by Shaw (1981) and popularized by
Fraser and Swinney (1986) and Fraser (1989) over a decade ago. More
recently, Palus (1995 , 1996b ) and Palus and colleagues (1993) have
promoted the use of "redundancy" as an information
theoretic measure of determinism.
One may begin with a measure of entropy H(1,r),
which is the average number of bits needed to describe a single value
of the time series to a precision r. For instance, if
pr is the fraction of time series values between
x and x + r, then:
|
(5)
|
and:
|
(6)
|
Let H(m,r) be the average
number of bits needed to describe a sequence of m values
(equivalently, the number of bits in an m-dimensional
embedding x). In general,
$$H(m, r) \le m\,H(1, r) \qquad (7)$$
because describing a sequence of m values cannot take
more bits than describing m individual values. In fact, one
can usually describe the sequence with considerably fewer bits, because
the time series contains some redundancy. This redundancy is defined formally by:
$$R(m, r) = m\,H(1, r) - H(m, r) \qquad (8)$$
The definition of entropy in the equation above for
H(1,r) has many useful properties, but Prichard and
Theiler (1995) argue that a definition based on the correlation
integral has some advantages, particularly from the viewpoint of
computational accuracy from a finite set of points. Here,
$$H_2(m, r) = -\log_2 C_2(m, r) \qquad (9)$$
and redundancy for the embedded trajectory is:
$$R_2(m, r) = m\,H_2(1, r) - H_2(m, r) \qquad (10)$$
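As a rough sketch, an order-two redundancy can be computed from correlation integrals as shown below. The code assumes that the correlation-integral entropy is the negative base-2 logarithm of the correlation integral and that redundancy is m times the single-value entropy minus the m-value entropy; these forms, and all function names, are assumptions for illustration and should be checked against Prichard and Theiler (1995).

```python
import numpy as np


def embed(x, m, tau=5):
    """Delay vectors as rows (component order does not affect max-norm distances)."""
    x = np.asarray(x, dtype=float)
    n = x.size - (m - 1) * tau
    return np.column_stack([x[k * tau:k * tau + n] for k in range(m)])


def corr_integral(v, r, window=50):
    """Fraction of pairs with index separation > window and max-norm distance < r."""
    n = v.shape[0]
    hits = 0
    pairs = 0
    for lag in range(window + 1, n):
        dist = np.max(np.abs(v[lag:] - v[:-lag]), axis=1)
        hits += int(np.count_nonzero(dist < r))
        pairs += dist.size
    return hits / pairs


def entropy2(x, m, r, tau=5, window=50):
    """Assumed order-two entropy in bits: -log2 of the correlation integral."""
    c = corr_integral(embed(x, m, tau), r, window)
    return np.inf if c == 0 else -np.log2(c)


def redundancy2(x, m, r, tau=5, window=50):
    """Assumed redundancy: m * H2(1, r) - H2(m, r)."""
    return m * entropy2(x, 1, r, tau, window) - entropy2(x, m, r, tau, window)
```

For uncorrelated noise the two terms nearly cancel, whereas a strongly periodic signal gives a clearly positive value, in line with the conceptual description of redundancy.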
The range of radius over which entropy calculations are valid
depends on embedding dimension, and this range may not overlap at large
and small embedding dimensions. At high embedding dimension and small
radius the correlation integral may be zero, resulting in infinite
entropy, or large radius values at low embedding dimension may approach
or exceed the size of the trajectory so that entropy is not effectively
a function of radius. To avoid these cases, we calculated redundancy at
embedding dimension m by means of entropies at embedding
dimensions m and n, where n is less
than m, using the variant definition:
|
(11)
|
which reduces to the standard definition for
n = 1. For m = 16, n = 4 was used, and for m = 32, n = 8. Results are the measures R'2(m,r) for
m = 4, 8, 16, and 32 and r = 0.5, 1.0, 1.5, and 2.0, of which the results are used only for those
values of r for which there was overlap for the stated
combinations of m and n.
Kaplan's δ/ε
Let the distance between two points at time t
be:
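$$\delta_{i,j} = \|\mathbf{x}_i - \mathbf{x}_j\| \qquad (12)$$
where the double bars indicate the maximum
norm, and let the distance between the images of the points at
k time steps later be:
$$\epsilon_{i,j} = \|\mathbf{x}_{i+k} - \mathbf{x}_{j+k}\| \qquad (13)$$
where k is the number of sample periods in time
interval T. For points closer together than r,
the average separation of their images at time T later
is:
$$E_r = \frac{1}{N_r} \sum_{\delta_{i,j} \le r,\; |i-j| > W} \epsilon_{i,j} \qquad (14)$$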
where Nr is the number of point pairs
contributing to the average at the specified value of r.
Er is the average separation between all pairs
(xi, xj) of
points that, k time steps earlier, were separated by a
distance i k,j k r.
The window W of excluded points is 50, as in the calculation
of the correlation integral, to avoid pairs of points on the same loop
of the trajectory. The average distance Er is
calculated from the equation above for 0 ≤ r < 8, accumulating sums in 256 bins of width 1/32, with k = 26 sample periods (T = 101.56 msec).
Nondeterministic systems are expected to have values of
εi,j that are independent of the previous separations
δi,j, so a plot of
Er versus r will tend to be a line
with zero slope. For deterministic systems, Er
is expected to be small for small r and to increase with r up to a maximum that depends on the size of
the trajectory. Figure 4 is an example of
this plot. The part of the curve dependent on r is
characterized by a least-squares fit of a straight line
Er ≈ A + Br, where
A is the intercept of the fit line at r = 0, and B is the slope. Points to be fit by the straight line
are weighted by the number of point pairs Nr
contributing to the cumulative average. All sets of consecutive points
that begin at the first nonzero point and extend over between 1/6 and 1/2 of the remaining points are then tested, and the set with the least residual error from the straight-line fit is used as the best fit. Results are the measures identified in Table 1 as A(m), the intercept of the function at
r = 0, and B(m), the slope.
Dashed lines in Figure 4 show the estimated slopes B and
intercept A at r = 0.
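The procedure described above can be sketched as follows. This is an illustrative reading rather than the authors' code: the function names are hypothetical, separations are accumulated in bins and then summed cumulatively so that each bin's upper edge plays the role of r, and the search over candidate sets of consecutive points is omitted, leaving only the weighted straight-line fit.

```python
def max_norm(p, q):
    """Maximum-norm distance between two embedded points."""
    return max(abs(a - b) for a, b in zip(p, q))


def cumulative_separation(points, k, r_max=8.0, n_bins=256, window=50):
    """Cumulative average image separation as a function of radius.

    For every pair (i, j) more than `window` samples apart, the initial
    separation selects a bin and the separation k steps later is added
    to that bin. Returns radii (upper bin edges), cumulative means, and
    the number of pairs contributing to each mean.
    """
    width = r_max / n_bins
    eps_sum = [0.0] * n_bins
    count = [0] * n_bins
    last = len(points) - k
    for i in range(last):
        for j in range(i + window + 1, last):
            b = int(max_norm(points[i], points[j]) / width)
            if b < n_bins:
                eps_sum[b] += max_norm(points[i + k], points[j + k])
                count[b] += 1
    radii, means, weights = [], [], []
    run_sum, run_n = 0.0, 0
    for b in range(n_bins):
        run_sum += eps_sum[b]
        run_n += count[b]
        if run_n:  # skip radii below the first populated bin
            radii.append((b + 1) * width)
            means.append(run_sum / run_n)
            weights.append(run_n)
    return radii, means, weights


def weighted_line_fit(x, y, w):
    """Weighted least-squares line y = A + B*x; returns (A, B)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope
```

The defaults mirror the values quoted in the text (256 bins of width 1/32, window of 50); the paper's value k = 26 would be passed explicitly.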

Figure 4.
Kaplan's δ/ε measure of determinism for a
representative segment of EEG. For pairs of points separated by
not more than distance r at time t, the
mean separation between their images at t + 100 msec is
E(r) for various values of
embedding dimension m. Straight lines
E = A + Br are fit to
the curves, as shown by the dashed lines. For completely
nondeterministic systems (i.e., white noise) the slope B
will be close to zero, and the intercept A will be close
to the value of E at large r. Decreasing
values of intercept and increasing slope provide increasing evidence of
deterministic structure. An ideal noise-free low-dimensional
deterministic system would be expected to produce an intercept of zero,
at least in the limit of infinite data, and a well defined positive
slope.
Statistical analyses
Statistical tests are described with respect to questions that
they are intended to examine.
(1) Is there evidence for significant nonlinear structure in the EEG?
This question was examined by comparing EEG data and their surrogates,
using parametric and nonparametric tests. Because surrogates lack
nonlinear structure that may be present in the EEG from which the
surrogates are created, statistical tests for comparison of EEG values
with those of its surrogates, for measures sensitive to nonlinear
structure, serve as indicators of nonlinear structure in the EEG. Such
tests require special consideration because variation among subjects,
variation among 8 sec EEG segments for each subject, and variation
among surrogates for each 8 sec segment provide separate contributions
to the total variance. To avoid the pooling of these variances, we
conducted three tests: two (one nonparametric and one parametric) based
on variance among surrogates and one (parametric) based on variance
among subjects.
The nonparametric rank-based test calculates the rank of the EEG value
for each measure with respect to the combined EEG and its 39 surrogate
values for each of 22 8 sec segments for each subject and condition.
The rank is a number in the range 1-40. The null hypothesis that there
is no difference between the EEG and its surrogates can be rejected
with a confidence level of 95% if the EEG rank is 1 or 40 (see
Barnard, 1963; Hope, 1968). A corresponding parametric test uses the
mean difference between the EEG and its surrogates, and the SD of this
mean difference, over the 39 surrogates of the EEG to calculate a
t value and corresponding probability in a two-tailed
Student's t test for the same null hypothesis as for the
nonparametric test. For each test the number of 8 sec segments meeting
the criterion p < 0.05 is counted and averaged over
subjects, yielding the mean number of "significantly nonlinear"
segments per subject, with a range of 0-22. The third test provides an
overall probability value for each measure and condition by calculating
an overall mean surrogate difference and the SE of this mean
difference, from which the t statistic and its corresponding
two-tailed p value are calculated. The overall mean
surrogate difference is the mean over subjects of the subject mean
surrogate differences, which are averages of the 22 8 sec segment mean
surrogate differences for each subject. Because this test uses only the
variance among subjects, it has 31 degrees of freedom.
(2) Do EEGs collected under a placebo condition differ from those
collected after ingestion of alcohol? To test this hypothesis, we used
paired t tests to compare data from the ethanol condition with the placebo condition for each of the measures in each
subject.
(3) Are nonlinear statistics better than linear statistics for
distinguishing between EEGs collected under the placebo and ethanol
conditions? Discriminant function analyses were used to test this
hypothesis.
(4) What measures of nonlinear structure are better in distinguishing
an EEG from its surrogates or in distinguishing placebo from ethanol?
Discriminant function analyses were used for this purpose, predicting
membership in two mutually exclusive groups (ethanol vs placebo, or EEG
vs surrogates) from a set of linear and nonlinear measures that were
used as predictors.
(5) Does ethanol produce increases or decreases in evidence for
nonlinear structure as compared with placebo? For this test the
variable analyzed for each measure was the number for each subject of 8 sec segments meeting the criterion that the EEG rank with respect to
combined values of the EEG and its surrogates was 1 or 40. The values
for this variable were used in paired Wilcoxon signed rank tests, with
the pairing of placebo and ethanol values for each subject. This
nonparametric procedure avoids assumptions regarding the distribution
of variables.
(6) Do linear or nonlinear measures of the actions of alcohol on the
EEG correlate with a person's subjective report of intoxication? Forward selection stepwise regression was used to compare values obtained on the SHAS to linear and nonlinear EEG measures concurrently obtained during the alcohol session. Significance on this test was set
at p < 0.01.
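The rank-based test of question (1) and the paired comparison of question (5) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, ties between the EEG value and a surrogate are not treated specially, and the signed rank function returns only the rank sums (the statistic T is taken here as the smaller of the two sums, which is an assumption) without computing a p value.

```python
def surrogate_rank(eeg_value, surrogate_values):
    """Rank (1 = smallest) of the EEG value among itself and its surrogates."""
    return 1 + sum(1 for s in surrogate_values if s < eeg_value)


def count_significant_segments(segments):
    """Count segments whose EEG value ranks first or last.

    `segments` is a list of (eeg_value, surrogate_values) pairs. With 39
    surrogates, an extreme rank has two-sided probability 2/40 = 0.05
    under the null hypothesis.
    """
    count = 0
    for eeg_value, surrogate_values in segments:
        n_total = len(surrogate_values) + 1
        if surrogate_rank(eeg_value, surrogate_values) in (1, n_total):
            count += 1
    return count


def wilcoxon_signed_rank(placebo_counts, ethanol_counts):
    """Return (T, W_plus, W_minus) for paired per-subject counts.

    Zero differences are dropped and tied absolute differences share
    their average rank.
    """
    diffs = [p - e for p, e in zip(placebo_counts, ethanol_counts) if p != e]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = average_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus), w_plus, w_minus
```

In this sketch, `count_significant_segments` would be applied to the 22 segments of each subject and condition, and the resulting per-subject counts for placebo and ethanol would be passed to `wilcoxon_signed_rank`.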
 |
RESULTS |
The EEG contains significant evidence of nonlinear structure
The results from the three statistical tests for differences
between the EEG and its surrogates are shown in Table 1. For surrogate
differences (EEG values minus those of its surrogates), the
p values indicate that both placebo and ethanol EEGs differ with high confidence from surrogates for all nonlinear measures except
correlation dimension. Because the difference between the EEG and its
phase-randomized amplitude-adjusted surrogates is that the latter lack
nonlinear structure, it is concluded that the EEG must contain such
structure. The null hypothesis underlying the use of this type of
surrogates is that the EEG is linearly filtered Gaussian noise to which
a nonlinear static transform may have been applied. Results from these
tests indicate, therefore, that the EEG is not modeled effectively as
linearly filtered Gaussian noise.
Under the null hypothesis the expected number per subject of
significantly nonlinear (using the traditional p < 0.05 criterion) 8 sec segments is 0.05 × 22 = 1.1, and
one-sided critical values for p < 0.01 and
p < 0.001 are 1.55 and 1.7, respectively. It is shown
in Table 1 that the mean number of significantly nonlinear segments
corresponds to p < 0.001 for all nonlinear measures
except correlation dimension; this holds for both parametric and
nonparametric tests and for both placebo and ethanol. Correlation
dimension fails to achieve significance at any embedding dimension for
placebo or for ethanol. These tests, based on variation among
surrogates, are consistent with the test on the basis of variance among
subjects in providing evidence for nonlinear EEG structure.
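The text does not say how the critical values 1.55 and 1.7 were obtained. One reading that comes close, offered only as an assumption, treats the number of significant segments per subject as binomial (22 independent segments, probability 0.05 each), averages over the 32 subjects, and applies one-sided Student's t quantiles with 31 degrees of freedom. The quantiles below are standard tabulated values entered by hand, not computed.

```python
import math

N_SEGMENTS = 22   # 8 sec segments per subject
ALPHA = 0.05      # per-segment significance criterion
N_SUBJECTS = 32   # subjects in the study

# Expected count of "significant" segments per subject under the null.
expected = N_SEGMENTS * ALPHA

# Binomial variance of one subject's count, assuming independent segments.
variance = N_SEGMENTS * ALPHA * (1 - ALPHA)

# Standard error of the mean count across subjects.
standard_error = math.sqrt(variance / N_SUBJECTS)

# Tabulated one-sided t quantiles for 31 degrees of freedom (assumed choice).
T_ONE_SIDED_31DF = {0.01: 2.453, 0.001: 3.375}

critical = {p: expected + t * standard_error
            for p, t in T_ONE_SIDED_31DF.items()}

for p, value in critical.items():
    print(f"p < {p}: mean count above {value:.2f}")
```

Under these assumptions the thresholds come out near 1.54 and 1.71, close to the quoted 1.55 and 1.7; the small difference may reflect rounding or a slightly different calculation by the authors.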
Mean numbers of significantly nonlinear segments with the conservative
nonparametric test are only slightly smaller than for the parametric
test. This indicates that distributions among surrogates are not
unusual and that the assumptions of the t test are
reasonable for these data. Interestingly, the mean number of
significantly nonlinear segments provides inconsistent advice regarding
the difficult question of optimum embedding dimension. For 1% radius the greatest evidence for nonlinear structure is seen at the highest embedding dimension, but for redundancy and Kaplan's cumulative measures the evidence is greatest at the lowest embedding
dimension.
Slope asymmetry
Mean surrogate difference for slope asymmetry is shown in Table 1.
It was found to be significant and positive for both placebo and
ethanol. Figure 1 shows representative EEG waveforms and corresponding slope asymmetry in which asymmetry is seen to be dominantly positive. Slope asymmetry values for surrogates were found to be consistently near zero, providing confirmation of negligible nonlinear structure in
surrogates.
Correlation dimension
The mean numbers of significantly nonlinear segments for
correlation dimension, in Table 1, are not notably greater than
expected if the surrogate difference were zero but are sufficient to
provide statistical significance of the mean surrogate difference for ethanol and for placebo at high embedding dimension. The negative values of surrogate difference indicate that estimates of trajectory dimension for surrogates are greater than for the EEG. The slopes of
correlation integral curves, in Figure 3, are consistent with these
results in not providing evidence for low-dimensional attractors.
One percent radius
The radius of cubes containing 1% of points is seen from
surrogate difference to be greater for surrogates than for the EEG at
all values of embedding dimension, indicating that EEG trajectories are
denser than those of its surrogates.
Redundancy
The mean surrogate difference for redundancy was found to be
positive at all values of embedding dimension and radius for both
placebo and ethanol, indicating that redundancy for the EEG is
greater than for its surrogates. The mean number of significantly nonlinear segments was found to be the greatest at smallest radius at
each value of embedding dimension.
Kaplan's δ/ε
The surrogate difference for cumulative intercept A
is negative and for slope B is positive at all embedding
dimension values, indicating that the EEG is more deterministic than
its surrogates for both placebo and ethanol.
The effects of ethanol can be distinguished from placebo by
many measures
The EEG after placebo could be distinguished from that after the
ingestion of a low dose of ethanol by most of the EEG measures, with
and without surrogates. The measures that significantly differentiated ethanol from placebo are illustrated in Figure
5. Of five linear measures the coherence
time, -fraction, and -fraction, but not autocorrelation time or
total power, were found to differentiate ethanol significantly from
placebo. Among the nonlinear measures 16 of 25 were significant, with
correlation dimension at any embedding dimension and redundancy at
embedding dimension of 4 and 8 failing to make the differentiation. In
the matter of optimum embedding dimension, on the basis of the
significance of these tests, the choice among those used would probably
be m = 16.

Figure 5.
The effects of ethanol on the EEG. The EEG that
follows placebo could be distinguished clearly from that after the
ingestion of a low dose of alcohol by most of the measures. The
measures that significantly differentiated ethanol from placebo are
illustrated in this figure. White bars, Placebo;
black bars, ethanol; lined white bars,
placebo surrogates; lined black bars, ethanol
surrogates. Three of five linear measures were able to distinguish
ethanol from placebo. Among the nonlinear measures, 16 of 25 were
significant. *p < 0.01; **p < 0.001; ***p < 0.0001; paired t
tests.
Multivariate analysis identifies measures of asymmetry, trajectory
density, and determinism as the most able to detect nonlinear structure
in the EEG
A discriminant function analysis was used to determine whether a
combination of nonlinear measures could account for most of the
variance in nonlinear structure in the EEG. Mean values for each of the
nonlinear measures for surrogates and the EEG were entered by using a
forward selection procedure, and the model selected slope asymmetry,
Kaplan's δ/ε slope at embedding 16, Kaplan's δ/ε slope at
embedding 4, and 1% radius at embedding dimension m = 4 as the best combination to discriminate the EEG from its surrogates.
This model was found to be significant (Wilks F = 32.2; df = 1, 59; p < 0.00001). The jackknifed
classification matrix indicates that the model was 94% correct in
classifying the EEG from its surrogates. The model classified four EEG
segments as surrogates but classified 0 surrogates as EEG, as seen in
Table 2.
Table 2.
Discriminant function analysis: EEG versus surrogate:
nonlinear measures model (jackknifed classification matrix)
Multivariate analysis identifies those linear and nonlinear
measures best at discriminating between ethanol and placebo
Discriminant function analyses also were used to determine which
measures could account for the largest part of the variance between
EEGs collected during alcohol and placebo conditions in separate tests
that used linear and nonlinear measures. Among nonlinear measures the
model chose only two, both measures of redundancy: at radius of 2 and
embedding 16 and at radius of 1 and embedding 4. The model was
significant (Wilks F = 5.11; df = 2, 61;
p < 0.009), with a jackknifed classification rate of
66% correct. Of linear measures the model selected only fraction, which was also significant (Wilks F = 6.87; df = 1, 62; p < 0.01), and correctly classified 66% of
the variance. When both linear ( fraction) and nonlinear measures
(redundancy) were combined, the resultant model was not significant at
the p < 0.01 level (Wilks F = 3.56; df = 3, 60; p < 0.02) although it
classified 64% correctly, as seen in Table
3. Although both of these models are,
strictly speaking, significant, the classification rates are only
slightly better than the 50% that would be expected by chance.
Table 3.
Discriminant function model for ethanol versus placebo:
nonlinear measures model (jackknifed classification matrix)
Ethanol reduces evidence for nonlinear structure in the EEG
To test whether ethanol produced an increase or a decrease in
nonlinear structure in the EEG, we compared the evidence for nonlinearity under the placebo condition with that for the alcohol condition. This was accomplished by comparing the results for the
nonparametric rank test of EEG-surrogate differences for the ethanol
and placebo condition, using the Wilcoxon paired samples signed rank
test. As shown in Figure 6, the placebo
condition contained significantly more evidence for nonlinear
structure, when compared with ethanol, as determined by this test for
the following measures: slope asymmetry (T = 90;
p < 0.002), correlation dimension at embedding 4 (T = 106; p < 0.004), redundancy at embedding 4 and radius of 0.5 (T = 106;
p < 0.004) and 1.0 (T = 125;
p < 0.01), and redundancy at embedding 32 and radius
of 1.5 (T = 115; p < 0.01). That
ethanol reduces evidence for nonlinear structure is also evident from an
inspection of the last four columns of Table 1. The mean number of
significantly nonlinear segments is larger for placebo than for ethanol
for 20 of 25 measures for the t tests and for 23 of 25 measures for the rank tests. (Neglecting the four correlation dimension
tests, none of which identified a significant number of nonlinear
segments, these numbers become 19 of 21 for the t tests and
21 of 21 for the rank tests.)

Figure 6.
Ethanol reduces evidence for nonlinearity. To test
whether ethanol produced an increase or a decrease in nonlinear
structure in the EEG, we compared the evidence for nonlinearity under
the placebo condition with that for the ethanol condition. This was
accomplished by using a nonparametric rank test for each of the 8 sec
segments of EEG for each subject and comparing, for each of the
nonlinear measures, the values obtained from EEG with those from its
surrogates for both the placebo and ethanol conditions. Then the number
of segments for which a rank of 1 or 40 was obtained for the ethanol
and placebo conditions was compared, using the Wilcoxon paired
samples signed rank test. The placebo condition contained significantly
more evidence for nonlinear structure, as measured by slope asymmetry
(T = 90; p < 0.002),
correlation dimension at embedding 4 (T = 106;
p < 0.004), redundancy at embedding 4 and radius
of 0.5 (T = 106; p < 0.004)
and 1.0 (T = 125; p < 0.01),
and redundancy at embedding 32 and radius of 1.5 (T = 115; p < 0.01). *p < 0.01;
**p < 0.001.
Nonlinear measures of the actions of alcohol on the EEG correlate
with subjective reports of intoxication
The total score from the SHAS for the alcohol session was compared
with the EEG measures obtained at that same time period (90 min after
alcohol consumption) in a forward selection stepwise manner, using
regression analyses. A model that compared all of the spectral-based
measures ( -fraction, -fraction, total power, autocorrelation)
found no significant correlations at the p < 0.01 or
p < 0.05 levels. None of the nonlinear measures showed a correlation that was significant at the p < 0.01 level; the most nearly significant result was for the Kaplan δ/ε
measure of intercept at embedding 4 and 32 (F = 5.164;
p < 0.013). However, in testing specifically whether
the nonlinear structure could account for a correlation between the EEG
and the SHAS by using surrogate-EEG differences, we found several of
the nonlinear measures to correlate significantly with the SHAS: 1%
radius at embedding dimension m = 1 and 4 (F = 5.2; p < 0.01); Kaplan's δ/ε
intercept at embedding 4 (F = 7.1; p < 0.01); and redundancy at embedding 4 and 16 (F = 6.7;
p < 0.002).
 |
DISCUSSION |
There has been considerable interest, over at least the last 15 years, in exploiting techniques developed from nonlinear dynamics for
characterizing biological systems (see Mackey and Glass, 1977; Mandell,
1984; Goldberger et al., 1990; Kaplan and Cohen, 1990; Garfinkel et
al., 1992; Glass and Kaplan, 1993; Lopes da Silva et al., 1994; Schiff
et al., 1994a; Gottschalk et al., 1995; Kaplan and Glass, 1995; Glass,
1997). A large number of studies have focused on the evaluation of the
EEG (see Rapp et al., 1989; Ehlers et al., 1991, 1995; Pijn et al.,
1991; Röschke, 1992; Mann et al., 1993; Pradhan and
Narayana-Dutt, 1993; Ferri et al., 1996; Stam et al., 1996) (for
review, see Jansen, 1991; Pritchard and Duke, 1992). A number of
authors have suggested that chaotic behavior may be present in brain
electrical activity (Babloyantz et al., 1985; Rapp et al., 1985;
Babloyantz and Destexhe, 1986; Skarda and Freeman, 1987; Mayer-Kress et
al., 1988; Röschke and Basar, 1988; Pezard et al., 1992;
Röschke and Aldenhoff, 1992, 1993; Fell et al., 1993, 1996a,b;
Achermann et al., 1994a; Meyer-Lindenberg, 1996; Röschke et al.,
1997), although some of these results recently have come under scrutiny
(see Havstad and Ehlers, 1989; Rapp, 1993; Rapp et al., 1993; Achermann
et al., 1994b; Theiler, 1995; Pritchard et al., 1996; Theiler and Rapp,
1996).
Although even simple dimension algorithms usually can estimate the
dimensions of actual low-dimensional nonlinear systems, these
algorithms also often report spurious low dimensions for data sets that
arise from systems that are linear and/or stochastic. This difficulty
has led to a renewed interest in the simpler problem of distinguishing
linear from nonlinear dynamical systems. There are several aspects of
nonlinear systems that can be quantified (see Theiler, 1990, 1994;
Grassberger et al., 1991; Casdagli, 1992; Kantz and Schreiber, 1997).
The method of surrogate data (see Kaplan and Cohen, 1990; Theiler et
al., 1992; Prichard and Theiler, 1994; Rapp et al., 1994; Schreiber and
Schmitz, 1996; Theiler and Prichard, 1996, 1997; Chan, 1997; Kantz and
Schreiber, 1997; Schreiber, 1998) is a relatively new approach. This
method compares a data set of interest with a series of surrogate data sets that are constructed to be as "random" as possible while retaining all of the linear properties of
the original data set. A comparison of original and surrogate time
series, using several measures of nonlinear structure, can help to
determine whether a system contains nonlinear deterministic structure
rather than being simply linearly correlated noise. Using this
technique, some investigators have reported no significant differences
between the EEG and its surrogates (Kaplan and Cohen, 1990; Kaplan and
Glass, 1992; Glass et al., 1993; Palus et al., 1993) or only small
differences (Soong and Stuart, 1989; Theiler et al., 1992; Achermann et
al., 1994b; Prichard and Theiler, 1994; Casdagli et al., 1997).
Recently, in three studies the EEG time series could be
distinguished from linearly filtered noise with high significance
(Pritchard et al., 1995; Rombouts et al., 1995; Theiler and Rapp,
1996). However, those investigators also found no evidence for a
low-dimensional attractor in the EEG. The present investigation
confirms these findings.
Estimation of dimension was found to be one of the least sensitive
measures of nonlinear structure in the EEG. The correlation integral,
however, was still found to be useful for discrimination between the
EEG and its surrogates as well as for several measures (trajectory
density, redundancy) derived from it. The 1% radius, a measure that
estimates how the points in an attractor are distributed over a section
of phase space by estimating characteristic lengths or distances
between points, was much more sensitive in distinguishing between the
EEG and surrogates than dimension. Because many investigators rely on
the estimation of dimension as a primary variable in investigating whether the EEG can be distinguished from linearly filtered noise, this
may be one reason that some of the early results were less than
convincing. Palus (1996a) also has scrutinized the use of dimensional
estimates and Lyapunov exponents in this regard.
Several measures to estimate nonlinearity that have been applied less
routinely to neurophysiological systems were investigated. Kaplan's
δ/ε method (1994) with cumulative measure E is based on the idea that, if two points are close together on a trajectory, their images at some time later are more likely to be close if the
system is deterministic than if it is not. This measure was found to
discriminate quite clearly between the EEG and its surrogates. Other
authors have used measures of determinism to test whether simple
neuronal circuits such as monosynaptic spinal cord reflexes in the cat
or rat hippocampal brain slices (Chang et al., 1994; Schiff et al.,
1994b; Aitken et al., 1995) display deterministic structure.
Interestingly, in those studies (see Chang et al., 1994) it was found
that variability in spinal cord reflexes was stochastic in an isolated
spinal segment but was more deterministic if the segment was not
isolated. The authors argue in their interpretation of these results
that determinism in brain function may result from the integrative
activity of a very large number of neurons and that the process of
is |