Abstract
Entropy, as it relates to dynamical systems, is the rate of information production. Methods for estimation of the entropy of a system represented by a time series are not, however, well suited to analysis of the short and noisy data sets encountered in cardiovascular and other biological studies. Pincus introduced approximate entropy (ApEn), a set of measures of system complexity closely related to entropy, which is easily applied to clinical cardiovascular and other time series. ApEn statistics, however, lead to inconsistent results. We have developed a new and related complexity measure, sample entropy (SampEn), and have compared ApEn and SampEn by using them to analyze sets of random numbers with known probabilistic character. We have also evaluated cross-ApEn and cross-SampEn, which measure the similarity of two distinct time series, by applying them to cardiovascular data sets. SampEn agreed with theory much more closely than ApEn over a broad range of conditions. The improved accuracy of SampEn statistics should make them useful in the study of experimental clinical cardiovascular and other biological time series.
Keywords: probability; nonlinear dynamics
Nonlinear dynamical analysis is a powerful approach to understanding biological systems. The calculations, however, usually require very long data sets that can be difficult or impossible to obtain. Pincus (21, 22) devised the theory and method for a measure of regularity closely related to the Kolmogorov entropy, the rate of generation of new information, that can be applied to the typically short and noisy time series of clinical data. This family of statistics, named approximate entropy (ApEn), is rooted in the work of Grassberger and Procaccia (10) and Eckmann and Ruelle (5) and has been widely applied in clinical cardiovascular studies (3, 6–8, 12–19, 24, 26, 29, 32–34).
The method examines time series for similar epochs: more frequent and more similar epochs lead to lower values of ApEn. Informally, given N points, the family of statistics ApEn(m, r, N) is approximately equal to the negative average natural logarithm of the conditional probability that two sequences that are similar for m points remain similar, that is, within a tolerance r, at the next point. Thus a low value of ApEn reflects a high degree of regularity. Importantly, the ApEn algorithm counts each sequence as matching itself, a practice carried over from the work of Eckmann and Ruelle (5) to avoid the occurrence of ln (0) in the calculations. This step has led to discussion of the bias of ApEn (22, 23, 27). In practice, we find that this bias causes ApEn to lack two important expected properties. First, ApEn is heavily dependent on the record length and is uniformly lower than expected for short records. Second, it lacks relative consistency. That is, if ApEn of one data set is higher than that of another, it should, but does not, remain higher for all conditions tested (22). This shortcoming is particularly important, because ApEn has been repeatedly recommended as a relative measure for comparing data sets (22–24).
To reduce this bias, we have developed and characterized a new family of statistics, sample entropy (SampEn), that does not count self-matches. SampEn is derived from approaches developed by Grassberger and coworkers (2, 9–11). SampEn(m, r, N) is precisely the negative natural logarithm of the conditional probability that two sequences similar for m points remain similar at the next point, where self-matches are not included in calculating the probability. Thus a lower value of SampEn also indicates more self-similarity in the time series. In addition to eliminating self-matches, the SampEn algorithm is simpler than the ApEn algorithm, requiring approximately one-half as much time to calculate. SampEn is largely independent of record length and displays relative consistency under circumstances where ApEn does not.
Cross-ApEn is a recently introduced technique for analyzing two related time series to measure the degree of their asynchrony (20, 28). Cross-ApEn is very similar to ApEn in design and intent, differing only in that it compares sequences from one series with those of the second. Because it does not compare a series with itself, bias from self-matches does not arise. A potential problem, however, remains in the necessity for each template to generate a defined, nonzero probability. Thus each template must find at least one match for m + 1 points, or a probability must be assigned to it according to a "correction" strategy. We tested the effect of two extremes of correction strategies on cross-ApEn analysis. We find that cross-ApEn analysis lacks relative consistency, and conclusions about relative synchrony of pairs of time series depend on the unguided selection of analysis schemes. Cross-SampEn, on the other hand, is defined as long as one template finds a match, and we find that cross-SampEn remains relatively consistent for conditions where cross-ApEn does not.
THEORY
ApEn reports on similarity in time series.
We employ the terminology and notation of Grassberger and Procaccia (10), Eckmann and Ruelle (5), and Pincus (21) in describing techniques for estimating the Kolmogorov entropy of a process represented by a time series and the related statistics ApEn and SampEn. The parameters N, m, and r must be fixed for each calculation. N is the length of the time series, m is the length of sequences to be compared, and r is the tolerance for accepting matches. It is convenient to set the tolerance as r × SD, the standard deviation of the data set, allowing measurements on data sets with different amplitudes to be compared. Throughout this work, all time series have been normalized to have SD = 1.
We proceed as follows. For a time series of N points, {u(j): 1 ≤ j ≤ N} forms the N − m + 1 vectors x_{m}(i) for {i | 1 ≤ i ≤ N − m + 1}, where x_{m}(i) = {u(i + k): 0 ≤ k ≤ m − 1} is the vector of m data points from u(i) to u(i + m − 1). The distance between two such vectors is defined to be d[x(i), x(j)] = max {|u(i + k) − u(j + k)|: 0 ≤ k ≤ m − 1}, the maximum difference of their corresponding scalar components. Let B_{i} be the number of vectors x_{m}(j) within r of x_{m}(i) and let A_{i} be the number of vectors x_{m + 1}(j) within r of x_{m + 1}(i). Define the function

C_{i}^{m}(r) = B_{i}/(N − m + 1)

and

Φ^{m}(r) = (N − m + 1)^{−1} Σ_{i = 1}^{N − m + 1} ln C_{i}^{m}(r)
Pincus (21) saw that the calculation of Φ^{m}(r) − Φ^{m + 1}(r) for fixed parameters m, r, and N had intrinsic interest as a measure of regularity and complexity. He defines the related parameter ApEn(m, r) = lim_{N → ∞}[Φ^{m}(r) − Φ^{m + 1}(r)], which for finite data sets is estimated by the statistic ApEn(m, r, N) = Φ^{m}(r) − Φ^{m + 1}(r). Algebraic manipulation reveals that ApEn(m, r, N) is approximately (N − m + 1)^{−1} Σ_{i = 1}^{N − m + 1} [−ln (A_{i}/B_{i})], the average over the templates of the negative natural logarithm of the conditional probability that sequences within r of each other for m points remain within r at the next point.
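The template-wise calculation above can be sketched directly in code. The following Python function is an illustrative implementation of the ApEn(m, r, N) statistic as defined here (including self-matches), not the authors' original program; the name `apen` is our own, and `r` is treated as an absolute tolerance, so multiply by the series SD beforehand if desired.

```python
import math

def apen(u, m, r):
    """Approximate entropy ApEn(m, r, N) of sequence u, following the
    template-wise definition above: every template counts itself as a
    match, so C_i^m(r) is never zero and the logarithms are defined."""
    n = len(u)

    def phi(mm):
        # All n - mm + 1 overlapping templates of length mm.
        x = [u[i:i + mm] for i in range(n - mm + 1)]
        total = 0.0
        for xi in x:
            # C_i^m(r): fraction of templates (self included) within r
            # of x_i under the maximum (Chebyshev) distance.
            count = sum(
                1 for xj in x
                if max(abs(a - b) for a, b in zip(xi, xj)) <= r
            )
            total += math.log(count / (n - mm + 1))
        return total / (n - mm + 1)

    return phi(m) - phi(m + 1)
```

For a strictly periodic series the statistic is close to 0, whereas an irregular series of the same length yields a larger value.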
ApEn(m, r, N) is biased and suggests more similarity than is present.
It is important to note that ApEn takes a template-wise approach to calculating this average logarithmic probability, first calculating a probability for each template. The ApEn algorithm thus requires that each template contribute a defined, nonzero probability. This constraint is overcome by allowing each template to match itself. Formally, because d[x_{m}(i), x_{m}(i)] = 0 ≤ r, the ApEn algorithm counts each template as matching itself, a practice we will refer to as self-matching. This ensures that the functions C_{i}^{m}(r) are always positive, so that the logarithms in Φ^{m}(r) are defined.
To discuss the bias caused by including self-matches, let us redefine the conditional probability associated with the template x_{m}(i): let B_{i} denote the number of vectors x_{m}(j), with j ≠ i, such that d[x_{m}(i), x_{m}(j)] ≤ r, and denote the number of vectors x_{m + 1}(j), with j ≠ i, such that d[x_{m + 1}(i), x_{m + 1}(j)] ≤ r by A_{i}. The ApEn algorithm thus assigns to the template x_{m}(i) a biased conditional probability of (A_{i} + 1)/(B_{i} + 1), which is always greater than the unbiased A_{i}/B_{i}. In the limit as N approaches infinity, A_{i} and B_{i} will generally be large, making the biased and unbiased probabilities asymptotically equivalent. Therefore, this bias is evident only for the analysis of finite data sets and is a characteristic of the statistic ApEn(m, r, N), rather than the parameter ApEn(m, r). For a finite N, however, the result is that ApEn(m, r, N) is biased toward lower values of ApEn and returns values below those predicted by theory.
The largest deviation occurs when a large proportion of templates have B_{i} = A_{i} = 0, since these templates are assigned a conditional probability of 1, corresponding to perfect order. Furthermore, the difference between the biased and unbiased conditional probabilities assigned to individual templates makes the calculation sensitive to record length in a way that depends on the conditional probability. Suppose that the unbiased conditional probability is known and denote it by CP. For a given x_{m}(i), let B_{i} denote the number of template matches without counting self-matches. The original algorithm estimates CP as (1 + A_{i})/(1 + B_{i}) = (1 + B_{i} × CP)/(1 + B_{i}). The fractional error of this relative to CP is Err = {[(1 + B_{i} × CP)/(1 + B_{i})] − CP}/CP. To find the value of B_{i} necessary to keep the fractional error below a threshold Err_{max}, we isolate B_{i} from the inequality Err ≤ Err_{max}, yielding B_{i} ≥ [1 − CP(Err_{max} + 1)]/(CP × Err_{max}). For independent, identically distributed (iid) random numbers, obtaining B_{i} matches of length m requires a data set containing, on average, B_{i}/(CP)^{m} templates. For iid random numbers and m = 2, estimating CP = 0.368 [ApEn(m, r, N) = 1] within Err_{max} = 0.05 requires B_{i} ≥ 33 and a data set of >240 points. Estimating CP = 0.135 [ApEn(m, r, N) = 2] with similar resolution requires B_{i} ≥ 127 and >6,900 points.
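These record-length requirements follow mechanically from the inequality above. The short sketch below (with hypothetical helper names of our own) reproduces the arithmetic; exact ceiling conventions can differ by one from the rounded values quoted in the text.

```python
import math

def min_matches(cp, err_max):
    """Smallest B_i keeping the fractional error of (1 + B_i*CP)/(1 + B_i)
    relative to CP at or below err_max, per the inequality in the text."""
    return math.ceil((1 - cp * (err_max + 1)) / (cp * err_max))

def min_record(cp, m, err_max):
    """Average number of templates needed for a length-m template to
    accrue min_matches(cp, err_max) matches, for iid random numbers."""
    return min_matches(cp, err_max) / cp ** m

# CP = exp(-1) corresponds to ApEn = 1; CP = exp(-2) to ApEn = 2.
```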
The most straightforward way to eliminate the bias would be to remove self-matching from the ApEn algorithm, leaving it otherwise unaltered. However, without the inclusion of self-matches, the ApEn algorithm is not defined unless every template finds at least one match of length m + 1; any template with B_{i} = 0 or A_{i} = 0 introduces ln (0) into the calculation.
Can the bias be corrected?
It is suggested that this bias can be reduced with a family of estimators ε by assigning to each template the conditional probability (A_{i} + ε_{1})/(B_{i} + ε_{2}), where 0 ≤ ε_{1} ≤ ε_{2} and ε_{2} > 0, so that the estimate is defined even when no matches occur.
These approaches reduce the bias in the estimation of the individual conditional probabilities and ensure that perfect order would not be reported where none had been detected. None of the corrections eliminates bias; the bias is minimized only if ε_{1}/ε_{2} = CP, but CP is not known beforehand. Thus no family of estimators ε that minimizes bias can be chosen a priori.
SampEn statistics have reduced bias.
We developed SampEn statistics to be free of the bias caused by selfmatching. The name refers to the applicability to time series data sampled from a continuous process. In addition, the algorithm suggests ways to employ sample statistics to evaluate the results, as explained below.
There are two major differences between SampEn and ApEn statistics. First, SampEn does not count self-matches. We justified discounting self-matches on the grounds that entropy is conceived as a measure of the rate of information production (5), and in this context comparing data with themselves is meaningless. Furthermore, self-matches are explicitly dismissed in the later work of Grassberger and coworkers (2, 9, 11). Second, SampEn does not use a template-wise approach when estimating conditional probabilities. To be defined, SampEn requires only that one template find a match of length m + 1.
We began from the work of Grassberger and Procaccia (10), who defined C_{i}^{m}(r) as the fraction of the N − m + 1 vectors x_{m}(j) within r of x_{m}(i) and the average C^{m}(r) = (N − m + 1)^{−1} Σ_{i = 1}^{N − m + 1} C_{i}^{m}(r); entropy estimates are then obtained from the limiting behavior of ln [C^{m}(r)/C^{m + 1}(r)] as N → ∞ and r → 0.
In this form, however, the limits render it unsuitable for the analysis of finite time series with noise. We therefore made two alterations to adapt it to this purpose. First, we followed their later practice in calculating correlation integrals (2, 9, 11) and did not consider self-matches when computing C^{m}(r). Second, we considered only the first N − m vectors of length m, ensuring that, for 1 ≤ i ≤ N − m, x_{m}(i) and x_{m + 1}(i) were defined.
We defined B_{i}^{m}(r) as (N − m − 1)^{−1} times the number of vectors x_{m}(j) within r of x_{m}(i), where j ranges from 1 to N − m with j ≠ i to exclude self-matches, and A_{i}^{m}(r) as (N − m − 1)^{−1} times the number of vectors x_{m + 1}(j) within r of x_{m + 1}(i). We then set B^{m}(r) = (N − m)^{−1} Σ_{i = 1}^{N − m} B_{i}^{m}(r) and A^{m}(r) = (N − m)^{−1} Σ_{i = 1}^{N − m} A_{i}^{m}(r), and defined the parameter SampEn(m, r) = lim_{N → ∞} {−ln [A^{m}(r)/B^{m}(r)]}, which is estimated by the statistic SampEn(m, r, N) = −ln (A/B), where B and A are the total numbers of template matches of length m and forward matches of length m + 1, respectively.
The quantity A/B is precisely the conditional probability that two sequences within a tolerance r for m points remain within r of each other at the next point. In contrast to ApEn(m, r, N), which calculates probabilities in a template-wise fashion, SampEn(m, r, N) calculates the negative logarithm of a probability associated with the time series as a whole. SampEn(m, r, N) will be defined except when B = 0, in which case no regularity has been detected, or when A = 0, which corresponds to a conditional probability of 0 and an infinite value of SampEn(m, r, N). The lowest nonzero conditional probability that this algorithm can report is 2[(N − m − 1)(N − m)]^{−1}. Thus, the statistic SampEn(m, r, N) has ln (N − m) + ln (N − m − 1) − ln (2) as an upper bound, nearly doubling ln (N − m), the dynamic range of ApEn(m, r, N).
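A direct, O(N²) rendering of this statistic in Python might look as follows; it is an illustrative sketch of −ln (A/B), not the authors' optimized code, and `r` is again an absolute tolerance. Counting each unordered pair once leaves the ratio A/B, and hence SampEn, unchanged.

```python
import math

def sampen(u, m, r):
    """Sample entropy SampEn(m, r, N) = -ln(A/B).

    B counts pairs of distinct length-m templates (drawn from the first
    N - m vectors) within tolerance r; A counts the same pairs still
    within r when extended to length m + 1.  No self-matches."""
    n = len(u)
    b = a = 0
    for i in range(n - m):
        for j in range(i + 1, n - m):
            # Chebyshev distance over the first m points...
            if max(abs(u[i + k] - u[j + k]) for k in range(m)) <= r:
                b += 1
                # ...and then over the (m + 1)-th point.
                if abs(u[i + m] - u[j + m]) <= r:
                    a += 1
    if b == 0 or a == 0:
        raise ValueError("SampEn undefined: no template or forward matches")
    return -math.log(a / b)
```

A perfectly periodic series gives A = B and therefore SampEn = 0, the expected value for a completely regular signal.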
Confidence intervals inform the implementation of SampEn statistics.
SampEn is not defined unless template and forward matches occur and is not necessarily reliable for small numbers of matches. We have viewed the SampEn(m, r, N) calculation as a process of sampling information about regularity in the time series and used sample statistics to inform us of the reliability of the calculated result. For example, say that we find B template matches, allowing for no more than B forward matches, A of which actually occur. We assign a value of 1 to the A forward matches and a value of 0 to the (B − A) potential forward matches that do not occur and compute the conditional probability measured by SampEn as the average of this sample of 0s and 1s. For operational purposes, we will assume that the sample averages follow a Student's t_{d} distribution, where d is the number of degrees of freedom. We can then say with 95% confidence that the "true" average conditional probability of the process is within SD × t_{(B − 1, 0.975)}/√B of the measured value, where SD is the standard deviation of the sample of 0s and 1s.
The confidence intervals for SampEn are displayed as error bars in the figures. For some small values of N and r, no value of SampEn is given. This indicates that B = 0, A = 0, or the confidence intervals extended to a probability of >1 or <0. In these cases, no value of SampEn(m, r, N ) can be assigned with confidence.
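As an illustration, such an interval can be computed from the match counts alone. The sketch below substitutes the normal quantile for the Student's t quantile, a simplification that is reasonable for moderate B (the Python standard library provides no t quantiles); `sampen_ci` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def sampen_ci(a, b, conf=0.95):
    """Approximate confidence interval for the conditional probability A/B.

    Treats the B potential forward matches as a sample of A ones and
    (B - A) zeros, as in the text, and uses the normal approximation to
    the Student's t interval."""
    cp = a / b
    # Sample standard deviation of the 0/1 sample.
    sd = math.sqrt((a * (1 - cp) ** 2 + (b - a) * cp ** 2) / (b - 1))
    z = NormalDist().inv_cdf(0.5 + conf / 2)  # approx. t_(B-1, 0.975)
    half = z * sd / math.sqrt(b)
    return cp - half, cp + half
```

If the resulting interval spills outside [0, 1], the match counts are too sparse for a confident SampEn estimate, mirroring the criterion described above.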
Cross-ApEn and cross-SampEn measure asynchrony.
Cross-ApEn is a recently introduced technique for comparing two different time series to assess their degree of asynchrony or dissimilarity (20, 28). The definition of cross-ApEn is very similar to that of ApEn. Given two time series of N points, {u(j): 1 ≤ j ≤ N} and {v(j): 1 ≤ j ≤ N}, form the vectors x_{m}(i) = {u(i + k): 0 ≤ k ≤ m − 1} and y_{m}(i) = {v(i + k): 0 ≤ k ≤ m − 1}. The distance between two such vectors is defined as d[x_{m}(i), y_{m}(j)] = max {|u(i + k) − v(j + k)|: 0 ≤ k ≤ m − 1}. Define B_{i} as the number of vectors y_{m}(j) within r of x_{m}(i) and A_{i} as the number of vectors y_{m + 1}(j) within r of x_{m + 1}(i); cross-ApEn(m, r, N)(v‖u) is then calculated from these counts exactly as ApEn(m, r, N) is calculated from its template matches.
We made two observations. First, because no template is compared with itself, there are no self-matches. Consequently, cross-ApEn does not carry the self-match bias described above. Second, because templates from u are compared only with vectors from v, a template may find no match at all, leaving its conditional probability undefined.
Cross-ApEn is not always defined.
As noted above, cross-ApEn does not include self-matching and, thus, does not inherently suffer from the same bias as ApEn. A potential problem, however, remains in the necessity for each template to generate a defined, nonzero conditional probability. Thus each template must find at least one match for m + 1 points, or a probability must be assigned to it. No guidelines have been suggested for handling this potential difficulty. Cross-SampEn, on the other hand, requires only that one pair of vectors in the two series match for m + 1 points.
The family of MIX(P) stochastic processes (21) provided a testing ground for cross-ApEn. Informally, the MIX(P) time series of N points, where P is between 0 and 1, is a sine wave in which N × P randomly chosen points have been replaced with random noise. We calculated cross-ApEn(1, r, 250) for the pair [MIX(Q)‖MIX(P)] and its direction conjugate [MIX(P)‖MIX(Q)] for 16 realizations of each of the 6 combinations of P = 0.1, 0.2, 0.3 and Q = 0.5, 0.7 over a range of values of r from 0.01 to 1.0. Cross-ApEn(1, r, 250) [MIX(Q)‖MIX(P)] was not defined for any of the 96 pairs for r ≤ 0.16 and was defined for all of them only for r ≥ 0.50. Cross-ApEn(1, r, 250) [MIX(P)‖MIX(Q)] was not defined for any values of r ≤ 0.32 and was defined for all pairs only for r = 1.0.
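For reference, a MIX(P) series can be generated in a few lines. The sketch below follows Pincus's published definition, a √2-amplitude sine of period 12 in which each point is independently replaced, with probability P, by uniform noise on [−√3, √3], so both components have unit variance; the function name and seeding convention are our own.

```python
import math
import random

def mix(p, n, seed=None):
    """Generate a MIX(p) series of n points: a period-12 sine wave whose
    points are each replaced, with probability p, by uniform noise on
    [-sqrt(3), sqrt(3)] (Pincus's definition; both parts have variance 1)."""
    rng = random.Random(seed)
    out = []
    for j in range(n):
        if rng.random() < p:
            out.append(rng.uniform(-math.sqrt(3), math.sqrt(3)))
        else:
            out.append(math.sqrt(2) * math.sin(2 * math.pi * j / 12))
    return out
```

MIX(0) is thus a pure sine wave and MIX(1) pure iid noise, so P tunes the degree of order continuously between the two extremes.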
To broaden the conditions for which cross-ApEn was defined, we introduced a correction factor into its algorithm. To avoid ln (0) whenever a template found no matches, the first approach, which we called bias 0, assigned such templates a conditional probability of 1, so that they contribute nothing to the sum.
The second approach, which we called bias max, also modified only the probabilities assigned to unmatched templates.
The only difference between the two strategies is that bias max assigns to a template yielding no matches at all a probability of (N − m)^{−1}, the lowest nonzero probability allowed by the length of the time series. Thus bias 0 sets the bias toward a cross-ApEn value of 0 in the absence of any matches, whereas bias max sets the bias toward the highest observable value of cross-ApEn.
Cross-ApEn is direction dependent; cross-SampEn is not.
Because of the logarithms inside the summation, Φ^{m}(r)(v‖u) will not generally be equal to Φ^{m}(r)(u‖v). Thus cross-ApEn(m, r, N)(v‖u) and its direction conjugate cross-ApEn(m, r, N)(u‖v) are unequal in most cases.
In defining cross-SampEn, we set cross-SampEn(m, r, N)(v‖u) = −ln (A/B), where B is the total number of pairs [x_{m}(i), y_{m}(j)] within r of each other and A is the total number of such pairs that remain within r for m + 1 points. Because every template of u is compared with every vector of v, the counts A and B, and hence the statistic, are unchanged when the roles of u and v are exchanged.
We calculated cross-SampEn(1, r, 250) for the same realizations of the [MIX(Q)‖MIX(P)] pairs used above to test cross-ApEn and over the same range of r. In contrast to cross-ApEn(1, r, 250) [MIX(P)‖MIX(Q)] and cross-ApEn(1, r, 250) [MIX(Q)‖MIX(P)], cross-SampEn(1, r, 250) [MIX(P)‖MIX(Q)], which is identical to cross-SampEn(1, r, 250) [MIX(Q)‖MIX(P)], was defined for all 96 pairs over the entire range of r considered.
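The direction independence is easy to see in code: a single double loop counts every (u template, v vector) pair, so swapping the two series merely transposes the loop without changing the counts. The following Python sketch is our own rendering of cross-SampEn(m, r, N) = −ln (A/B), with r an absolute tolerance.

```python
import math

def cross_sampen(u, v, m, r):
    """Cross-SampEn(m, r, N)(v || u) = -ln(A/B): B counts pairs
    (x_m(i), y_m(j)) within r, A counts those still within r at
    length m + 1.  The counts are symmetric in u and v."""
    n = min(len(u), len(v))
    b = a = 0
    for i in range(n - m):
        for j in range(n - m):
            if max(abs(u[i + k] - v[j + k]) for k in range(m)) <= r:
                b += 1
                if abs(u[i + m] - v[j + m]) <= r:
                    a += 1
    if b == 0 or a == 0:
        raise ValueError("cross-SampEn undefined for these series")
    return -math.log(a / b)
```

Two phase-shifted copies of the same periodic signal, for example, give identical values in both directions, as the symmetry argument requires.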
ApEn and SampEn can be calculated analytically for series of random numbers.
ApEn and SampEn derive from formulas suggested to estimate the Kolmogorov entropy of a process represented by a time series. At their root, each is a measurement of the conditional probability that two vectors that are close to each other for m points will remain close at the next point. There are several models, including sets of iid random numbers, for which the theoretical values of the parameters ApEn(m, r) (21) and SampEn(m, r) can be calculated. We show here the case of uniform random numbers, for which the theoretical values of ApEn and SampEn are nearly identical.
The expected value of the key probability can be calculated analytically for series of iid numbers based only on their probabilistic distribution. The numbers' independence implies that the probability that two randomly selected sequences within rSD of each other for the first m points will remain within rSD at their next points is simply the probability that any two points will be within a distance rSD of each other. For random numbers with density function p(x) and standard deviation SD, the expression for this probability is

CP = ∫ p(y) [∫_{y − rSD}^{y + rSD} p(x) dx] dy
Because these expressions for the expected values of the parameters ApEn and SampEn depend solely on r and the probabilistic character of the data, uniform random numbers provide a benchmark for testing the estimation of ApEn over a range of parameters. In particular, ApEn and SampEn are expected to give identical results for uniformly distributed random numbers. Figure 1A shows a histogram of the random numbers used and an excerpt from the sequence (inset). The negative natural logarithm of this probability, evaluated for the uniform distribution, gives the theoretical values of ApEn and SampEn as a function of r.
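For uniformly distributed numbers the matching probability has a simple closed form: with t = rSD/width = r/√12, CP = 2t − t², so the predicted value is −ln (2t − t²). The sketch below states this, under our assumption of a unit-width uniform distribution, and checks it with a seeded Monte Carlo draw.

```python
import math
import random

def cp_uniform(r):
    """P(|X - Y| <= r*SD) for X, Y iid uniform; SD = width/sqrt(12),
    so t = r/sqrt(12) and CP = 2t - t**2 (valid for t <= 1)."""
    t = r / math.sqrt(12)
    return 2 * t - t * t

def entropy_uniform(r):
    """Theoretical ApEn(m, r) = SampEn(m, r) for iid uniform numbers."""
    return -math.log(cp_uniform(r))

# Monte Carlo check of the closed form (seeded for reproducibility).
rng = random.Random(42)
r = 0.2
sd = 1 / math.sqrt(12)
hits = sum(abs(rng.random() - rng.random()) <= r * sd
           for _ in range(200_000))
mc = hits / 200_000
```

At r = 0.2 the closed form gives CP ≈ 0.112, so the benchmark entropy is near 2.2, matching the straight theoretical line referred to in Fig. 1B.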
RESULTS AND DISCUSSION
SampEn agrees with theory more closely than ApEn.
For most processes, ApEn statistics are expected to have two properties. First, the conditional probability that sequences within r of each other remain within r should decrease as r decreases and the criterion for matching becomes more stringent. In other words, ApEn(m, r, N) should increase as r decreases (22, 27). This expected property is demonstrated in Fig. 1B by the straight line, which plots the theoretically predicted values of ApEn and SampEn. Second, ApEn should be independent of record length. The plot of theoretical values of ApEn and SampEn in Fig. 1B illustrates this expectation. Because of their similarity, we expect SampEn statistics to exhibit similar properties. It has been suggested that record lengths of 10^{m}–20^{m} should be sufficient to estimate ApEn(m, r) (27).
We tested ApEn and SampEn statistics on uniform, iid random numbers, because the results could be compared with the analytically calculated expected values. Figure 1, B and C, shows the performance of ApEn(2, r, N) and SampEn(2, r, N) on uniform iid random numbers. SampEn(2, r, N) very closely matches the expected results for r ≥ 0.03 and N ≥ 100 (Fig. 1, B and C), whereas ApEn(2, r, N) differs markedly from expectations for N < 1,000 and r < 0.2. Figure 2 shows SampEn and ApEn as functions of r (m = 2) for three sets of uniform random numbers consisting of 100, 5,000, and 20,000 points. SampEn statistics for r = 0.2 are in agreement with theory for much shorter data sets. We investigated the general applicability of SampEn and ApEn statistics to random numbers with other distributions. The analysis of numbers with Gaussian, exponential, and γ distributions with parameter λ = 1, 2, … , 10 gave results essentially identical to those shown in Fig. 1.
SampEn shows relative consistency where ApEn does not.
A critically important expected feature of ApEn is relative consistency (22, 27). That is, it is expected that, for most processes, if ApEn(m_{1}, r_{1})(S) ≤ ApEn(m_{1}, r_{1})(T), then ApEn(m_{2}, r_{2})(S) ≤ ApEn(m_{2}, r_{2})(T). In other words, if record S exhibits more regularity than record T for one pair of parameters m and r, it is expected to do so for all other pairs. Graphically, plots of ApEn as a function of r for different data sets should not cross over one another. The determination that one set of data exhibits greater regularity than another can be made only when this condition is met. We tested this expectation using 1,000-point realizations of the MIX(P) process, where the degree of order could be specified. Figure 3A shows that the plot of ApEn statistics of the less-ordered MIX(0.9), which has, on average, few matches of length m for a given template (small B_{i}) for small r, rises as a function of r and crosses over the plot of ApEn statistics of the more-ordered MIX(0.1). For r < 0.05, one would conclude incorrectly that MIX(0.9) was more ordered than MIX(0.1). Thus relative consistency does not hold for ApEn statistics.
We investigated the mechanism responsible for this lack of relative consistency. Note that, for r = 0.5, ApEn statistics correctly distinguished MIX(0.1) and MIX(0.9). For this value of r, the MIX(0.1) data yielded an average of >46 matches per template and ∼28 forward matches per template, whereas the MIX(0.9) data yielded ∼37 and 10, respectively. These large numbers of matches render the bias insignificant; that is, the unbiased A_{i}/B_{i} (28/46 and 10/37) is not very different from the biased (A_{i} + 1)/(B_{i} + 1) (29/47 and 11/38). As r is decreased, however, the number of template matches decreased more for the MIX(0.9) data than for the MIX(0.1) data. This made the bias significant for MIX(0.9) data under conditions for which it was insignificant in the MIX(0.1) data. For example, when r = 0.05, ApEn(2, r, 1,000) of MIX(0.1) was 0.463, whereas the value for MIX(0.9) was 0.505. The spurious similarity of the ApEn statistics is due to bias. The MIX(0.1) data yielded ∼27 matches per template and 22 forward matches, whereas the MIX(0.9) data yielded only 0.45 and 0.01, respectively. Thus more than one-half of the templates of the MIX(0.9) data matched no other templates and were assigned a conditional probability of 1. In this example, ApEn statistics lack relative consistency because less-ordered data sets have fewer matching templates and are more vulnerable to the bias generated by self-matches. SampEn analysis of the same data, on the other hand, reports correctly over the whole range of r (Fig. 3B).
Are SampEn statistics relatively consistent?
We have shown that SampEn statistics appear to be relatively consistent over the family of MIX(P) processes, whereas ApEn statistics are not. Although we believe that relative consistency should be preserved for processes for which probabilistic character is understood, we see no general reason why ApEn or SampEn statistics should remain relatively consistent for all time series and all choices of parameters.
We propose a general, but by no means exhaustive, explanation for this phenomenon. SampEn is, in essence, an event-counting statistic, where the events are instances of vectors being similar to one another. When these events are sparse, the statistics are expected to be unstable, which might lead to a lack of relative consistency. Recall that SampEn(m, r, N) is less than or equal to ln (B), the natural logarithm of the number of template matches. Suppose SampEn(m, r, N)(S) < SampEn(m, r, N)(T) and that the number of T's template matches, B_{T}, is less than the number of S's template matches, B_{S}, which would be consistent with T displaying less order than S. Provided that A_{T} and A_{S}, the numbers of forward matches, are relatively large, both SampEn statistics will be considerably lower than their upper bounds. As r decreases, B_{T} and A_{T} are expected to decrease more rapidly than B_{S} and A_{S}. Thus, as B_{T} becomes very small, SampEn(m, r, N)(T) will begin to decrease, approaching the value ln (B_{T}), and could cross over a graph of SampEn(m, r, N)(S) while B_{S} is still relatively large. Furthermore, as the number of template matches decreases, small changes in the number of forward matches can have a large effect on the observed conditional probability. Thus the discrete nature of the SampEn probability estimation could lead to small degrees of crossover and intermittent failure of relative consistency, and we cannot say that SampEn will always be relatively consistent. We have shown, however, that SampEn is relatively consistent for conditions where ApEn is not, and we have not observed any circumstance where ApEn maintains relative consistency and SampEn does not.
One source of the residual bias in SampEn is correlation of templates.
Although more consonant with theory than ApEn, SampEn statistics deviated from predictions for very short data sets. For 10^{5} sets of Gaussian random numbers with m = 2 and r = 0.2, we found that the deviation was <3% for record lengths >100 but as high as 35% for sets of 15 points. Figure 3C shows the biased results of SampEn(2, 0.2, N) for the range 4 ≤ N ≤ 100. We suspected that the integral expressions for the parameters ApEn(m, r) and SampEn(m, r) could not be used as expected values of the statistics ApEn(m, r, N) and SampEn(m, r, N) under all conditions, because the expressions relied on the assumption that the templates were independent of one another. As N decreases and m increases, however, a larger proportion of templates are composed of overlapping segments of the record and are thus not independent. Because of this correlation, results might deviate from these predictions for short data sets. We thus tested the hypothesis that the majority of the bias results from nonindependence of the templates.
One way to test the hypothesis is to compare observed values of SampEn with those obtained from a model accounting for template correlation. The hypothesis predicts that the values should match. We tested the simplest case of m = 2 and N = 4, where a data set can be represented by {w, x, y, z}, so that there are exactly two vectors of length m + 1 = 3, (w, x, y) and (x, y, z). If (w, x) is close to (x, y), it stands to reason that (w, x, y) will have a higher than expected probability of being close to (x, y, z). Formally, the conditional probability that (w, x, y) and (x, y, z) will be close, given that (w, x) and (x, y) are close, is the joint probability that the two triples are close divided by the probability that (w, x) and (x, y) are close; because the triples share the points x and y, this ratio differs from the value predicted under the assumption of independent templates.
A second way to test the hypothesis is to calculate SampEn statistics for one set of data under two conditions: with and without overlapping templates. The hypothesis predicts that the results should not match. For this test, we chose the case of m = 2 and N = 6, the shortest record containing two nonoverlapping templates of length m + 1 = 3. We can represent each set as {a, b, c, x, y, z}. For 10^{6} sets of six Gaussian random numbers, we calculated the conditional probability that (a, b, c) was within r of (x, y, z) given that their first two points were close, thus calculating the probability for pairs of disjoint templates. The result was 0.111, very close to the expected 0.112 for independent templates. For the same number sets, we then calculated the average value of SampEn(2, 0.2, 6), and the result was 0.094. Thus the two results do not match, in support of the hypothesis. We conclude from this analysis that the statistics SampEn(m, r, N) are not completely unbiased under all conditions and that the bias of SampEn for very small data sets is largely due to nonindependence of templates.
One method for removing this bias would be to partition the time series {u(j): 1 ≤ j ≤ N} into the m + 1 sets of neighboring, disjoint vectors of length m + 1, X_{i} = {[u(i + k(m + 1)), u(i + k(m + 1) + 1), … , u(i + k(m + 1) + m)]: 0 ≤ k ≤ [N − (m + i)]/(m + 1)}, where i, the initial point of the first template, ranges from 1 to m + 1. The conditional probability that vectors close for m points remain close at the next point would be calculated for each of the sets of vectors X_{i} and then averaged. Because this calculation compares only disjoint templates, it will not suffer from the bias introduced by nonindependent templates. This truly unbiased approach has the potentially severe limitation of reducing the number of possible template matches and enlarging the confidence intervals about the SampEn estimate. Because this bias appears to be present only for very small N, the disjoint-template approach does not appear necessary in usual practice.
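The disjoint-template calculation can be sketched as a small variant of SampEn: compute the conditional probability within each of the m + 1 partitions of non-overlapping (m + 1)-point vectors and average. This is our illustrative reading of the proposal, not code from the paper.

```python
import math

def sampen_disjoint(u, m, r):
    """SampEn variant comparing only disjoint templates: average the
    conditional probability over the m + 1 partitions of the series
    into non-overlapping vectors of length m + 1."""
    n = len(u)
    cps = []
    for start in range(m + 1):
        # Non-overlapping (m + 1)-point templates beginning at `start`.
        idx = list(range(start, n - m, m + 1))
        b = a = 0
        for ii, i in enumerate(idx):
            for j in idx[ii + 1:]:
                if max(abs(u[i + k] - u[j + k]) for k in range(m)) <= r:
                    b += 1
                    if abs(u[i + m] - u[j + m]) <= r:
                        a += 1
        if b:
            cps.append(a / b)
    if not cps or sum(cps) == 0:
        raise ValueError("undefined: no matches among disjoint templates")
    return -math.log(sum(cps) / len(cps))
```

The trade-off noted in the text is visible here: each partition holds roughly N/(m + 1) templates, so match counts, and hence statistical confidence, shrink accordingly.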
Cross-SampEn shows relative consistency where cross-ApEn does not.
As noted above, an essential feature of the measures of order is their relative consistency. That is, if one series is more ordered than another, it should have lower values of ApEn and SampEn for all conditions. We can extend this idea to cross-ApEn and cross-SampEn; if a pair of series is more synchronous than another pair, it should have lower values of cross-ApEn and cross-SampEn statistics for all conditions tested.
We tested the ability of cross-ApEn and cross-SampEn to distinguish between MIX(0.1) and the less-ordered MIX(0.6) processes. The strategy was to compare each with the intermediate MIX(0.3) process. The expected result was that the [MIX(0.3), MIX(0.1)] pair should appear more ordered than the [MIX(0.3), MIX(0.6)] pair, because MIX(0.1) is significantly more ordered than MIX(0.6). That is, cross-ApEn(2, r, 250) [MIX(0.3)‖MIX(0.1)] should be less than cross-ApEn(2, r, 250) [MIX(0.3)‖MIX(0.6)], and cross-SampEn(2, r, 250) [MIX(0.3)‖MIX(0.1)] should be less than cross-SampEn(2, r, 250) [MIX(0.3)‖MIX(0.6)].
We tested this prediction bidirectionally, that is, MIX(0.3) served as the template series for one analysis and as the target series for the next, and over a range of tolerances r, by using the bias max and bias 0 strategies for ensuring that cross-ApEn was defined. The results are shown in Fig. 4. MIX(0.3) was the template series in Fig. 4, C and E, and the target series in Fig. 4, D and F.
The expected result is that cross-ApEn and cross-SampEn should be less for the [MIX(0.3), MIX(0.1)] pair than for the [MIX(0.3), MIX(0.6)] pair. That is, the circles should always be below the squares. We found this to be true for only one of the four tests of cross-ApEn, the case of using MIX(0.3) as the target series with the bias max correction strategy (Fig. 4D). In the other cases, the order of results was reversed (Fig. 4C) or crossed over (Fig. 4, E and F). As shown in Fig. 4B, cross-SampEn returned the expected results with a high degree of confidence across the range of tolerances r.
Thus cross-ApEn statistics fail as a means of judging the relative order of two time series by their similarity to a third series. In practice, however, cross-ApEn has been used differently: to determine the relative synchrony of two pairs of time series of clinical data from different patients. We thus tested cross-ApEn and cross-SampEn on two sections of a long multivariate cardiovascular time series used in the 1991 Santa Fe competition for time series forecasting (35). The series consisted of concurrent measurements of a sleeping patient's heart rate (hr) and chest volume (cv). We compared the pairs (hr1, cv1) and (hr2, cv2) shown in Fig. 5A, excerpted from the larger time series. Here, the expected result was not known beforehand, and our question was whether one pair consistently appeared more synchronous than the other. This is an extension of the expected relative consistency of ApEn discussed above.
Figure 5 shows the results for N = 250, m = 1, and a range of r. We set m = 1, in accordance with published practice for analyzing series of similar length (28), and the record length of N = 250 exceeds the 10^{m}–20^{m} points recommended for ApEn analysis (27). Time series are shown in Fig. 5A. The question is whether the pair (hr1, cv1) has more joint synchrony than the pair (hr2, cv2). The expected result is that the circles should be consistently higher or lower than the squares in Fig. 5, D–F. This was not the case; conclusions about relative synchrony by use of cross-ApEn analysis depended on which series served as the template and on the correction strategy, and there was no consistent result. Cross-SampEn, on the other hand, consistently reported that (hr1, cv1) had more joint synchrony than (hr2, cv2) (Fig. 5B).
For 32 sets of these data, we further examined the correction methods, calculating cross-ApEn(m, r, N)(cv‖hr) and cross-ApEn(m, r, N)(hr‖cv) for each set. For the relaxed condition of r = 1.0, we found that cross-ApEn(1, 1, 250)(cv‖hr) was defined for only 19 of the 32 cases, whereas cross-ApEn(1, 1, 250)(hr‖cv) was defined for only 12 cases. For the more stringent case of r = 0.2, cross-ApEn(1, 0.2, 250)(cv‖hr) was never defined, whereas cross-ApEn(1, 0.2, 250)(hr‖cv) was defined for 2 of the 32 cases. By contrast, for all r ≥ 0.08, cross-SampEn(1, r, 250)(cv‖hr) was defined for each of the 32 pairs. Thus cross-SampEn had a more consistent performance for evaluating these clinical cardiovascular data.
Summary.
We have developed and characterized SampEn, a new family of statistics measuring complexity and regularity of clinical and experimental time series data, and compared it with ApEn, a similar family. We find that SampEn statistics 1) agree much better than ApEn statistics with theory for random numbers with known probabilistic character over a broad range of operating conditions, 2) maintain relative consistency where ApEn statistics do not, and 3) have residual bias for very short record lengths, in large part because of nonindependence of templates. Furthermore, cross-SampEn is a more consistent measure of the joint synchrony of pairs of clinical cardiovascular time series. We attribute the difficulties of ApEn analysis to the practice of counting self-matches, and those of cross-ApEn to the problem of unmatched templates resulting in undefined probabilities. The differences are that SampEn does not count templates as matching themselves and does not employ a template-wise strategy for calculating probabilities. SampEn statistics provide an improved evaluation of time series regularity and should be a useful tool in studies of the dynamics of human cardiovascular physiology.
Acknowledgments
We thank L. Pitt, D. Scollan, and Rizwanuddin for advice and Virginia's Center for Innovative Technology for support.
Footnotes

Address for reprint requests and other correspondence: J. R. Moorman, Box 6012, MR4 Bldg., UVA HSC, Charlottesville, VA 22908 (E-mail: rmoorman@virginia.edu).
 Copyright © 2000 the American Physiological Society