## Abstract

Dynamic aspects of R-R intervals have often been analyzed by means of linear and nonlinear measures. The goal of this study was to analyze binary sequences, in which only the dynamic information is retained, by means of two different aspects of regularity. R-R interval sequences derived from 24-h electrocardiogram (ECG) recordings of 118 healthy subjects were converted to symbolic binary sequences that coded the beat-to-beat increase or decrease in the R-R interval. Shannon entropy was used to quantify the occurrence of short binary patterns (length *N* = 5) in binary sequences derived from 10-min intervals. The regularity of the short binary patterns was analyzed on the basis of approximate entropy (ApEn). ApEn had a linear dependence on mean R-R interval length, with increasing irregularity occurring at longer R-R interval length. Shannon entropy of the same sequences showed that the increase in irregularity is accompanied by a decrease in occurrence of some patterns. Taken together, these data indicate that irregular binary patterns are more probable when the mean R-R interval increases. The use of surrogate data confirmed a nonlinear component in the binary sequence. Analysis of two consecutive 24-h ECG recordings for each subject demonstrated good intraindividual reproducibility of the results. In conclusion, quantification of binary sequences derived from ECG recordings reveals properties that cannot be found using the full information of R-R interval sequences.

- heart period dynamics
- symbolic dynamics
- approximate entropy
- Shannon entropy
- nonlinear dynamics
- surrogate data

In recent years, linear measures of heart rate variability (HRV) have been applied in a wide range of contexts, leading to a well-established diagnostic tool with more or less accepted standards (16, 17, 30). Today, HRV is applied not only in cardiac diseases but also in diseases that generally affect the autonomic nervous system (ANS). However, the influence of the sympathetic and parasympathetic branches of the ANS on linear measures of HRV, as well as the independent prognostic value of these measures with respect to high-risk patients with cardiac diseases, is still a matter of investigation (6, 9, 12). On the other hand, assessing HRV with nonlinear measures may supply information different from that of linear measures with the promise of better risk stratification (13, 32-34). However, in most cases it is difficult to interpret these complementary findings in one unifying picture. In this study we examine the dynamic properties of heart periods with the use of two different nonlinear approaches that can be regarded as two complementary aspects of dynamic properties. The results also shed new light on the interpretation of power spectral measures of HRV.

Different approaches lead to nonlinear measures of HRV. In nonlinear dynamics theory, the so-called state space is reconstructed from sequences of heartbeat periods that are generally defined as the time duration between successive R waves in the electrocardiogram (ECG), the R-R tachogram. In a second step, the state space and the dynamic behavior of the reconstructed dynamics can be quantified (e.g., with measures of dimension or Lyapunov exponents). For an overview, see Ref. 10. Practically, the sequences of heart periods are short, noisy, and often nonstationary. Thus the application of nonlinear measures to ECG recordings may lead to spurious indications of chaos (3, 7). However, one may guardedly say that this approach has yielded evidence of nonlinearities. Indeed, powerful quantities for describing heart period dynamics and for stratification of high-risk patients are still lacking (17).

Another approach to nonlinear measures of HRV is the quantification of complexity from the point of view of information theory. To this end, the sequence of heart periods can be analyzed with the help of entropy measures such as Shannon entropy or renormalized entropy (11, 25). These are often used in conjunction with the concept of symbolic dynamics or coding theory, i.e., reducing the amount of information by transforming the original time series into a symbolic sequence with a small set of symbols (8). These measures proved to be useful in detection of patients at high risk for sudden cardiac death (34). Another entropy measure for quantification of regularity in a time series is the approximate entropy (ApEn) (18, 24). ApEn has the ability to detect subtle differences in heart period dynamics that cannot be observed with commonly used linear measures (14, 15). Recently, the evaluation of ApEn for R-R tachograms derived from 24-h ECG recordings led to the suggestion of phase transitions, in the notion of synergetics, between daytime and nighttime heart period dynamics (2). It has also been shown that changes in fetal heart period complexity during pregnancy can be documented using ApEn (31). Though approximate entropy has been introduced for symbolic dynamics (20), its application to symbolic dynamics derived from physiological data has not been performed yet.

The goal of this study was to examine binary sequences derived from Holter recordings of healthy subjects to determine their pure dynamic properties. To this end, a “dynamic” or differential symbolization was used (1). Such a transformation into binary sequences is of particular interest because this method extracts solely dynamic properties of the R-R series, disregarding all information influenced by the absolute values of the R-R intervals, e.g., mean R-R interval, R-R standard deviation, and other measures of R-R interval variability. ApEn was used as a nonlinear measure of irregularity of short binary sequences to quantify their dynamic properties. Shannon entropy quantifies regularity on a larger scale of the symbolic dynamics under consideration and thus helped to make the results more precise. It is still unknown whether binary coding preserves nonlinear properties of the original R-R tachogram. To test the hypothesis that the binary representation of R-R dynamics still contains some important nonlinear properties, we made use of surrogate data. To demonstrate the intraindividual reproducibility of the binary approximate entropy, two consecutive 24-h ECG recordings for each subject were analyzed.

## METHODS

#### Subjects.

The subjects for this study were drawn from a previous study in which 121 healthy subjects were included (5). Three subjects were excluded from this analysis because of missing data. Two consecutively recorded 24-h ECGs (*ECGs A* and *B*) were available for the remaining 118 subjects (age: 20–40 yr, mean ± SD: 27 ± 6 yr; 78 females). The 24-h ECGs were recorded with Oxford FD3 solid-state recorders (Oxford Instruments, Abingdon, UK) with simultaneous R wave detection and a maximum sampling rate of 1,024 Hz during the QRS complex. This permitted a maximum resolution of 1 ms for the detection of the R waves. An Oxford Excel ECG analyzer allowed visual inspection of the automatically detected R waves. Generally, the number of ectopic or unrecognized beats was small (<1%), and thus such beats were not replaced or inserted. For further analysis, the R times were written to a binary data file that was exported to a personal computer.

#### Construction of symbolic sequences.

For each 10-min interval in the 24-h ECG (maximum 144 intervals/recording), the times between subsequent R waves (R-R intervals or heart periods) formed the corresponding R-R tachogram. Transformation of each 10-min R-R tachogram into a binary sequence was done as follows (see Fig. 1): beat-to-beat differences R-R_{n+1} − R-R_{n} > 0, i.e., a decrease in heart rate, were set to a value of 1, and differences R-R_{n+1} − R-R_{n} ≤ 0, i.e., an increase in heart rate, were set to a value of 0. The binary sequences were quantified by estimation of two different entropies: ApEn and Shannon entropy. Each entropy reveals different aspects of the binary sequence under consideration: ApEn is a nonlinear measure of irregularity in a time series (24), whereas Shannon entropy quantifies the amount of information in a time series (28).
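The symbolization rule can be sketched in a few lines of Python. This is a minimal illustration rather than the original analysis code; the function name `symbolize` and the example R-R values are hypothetical.

```python
from typing import List, Sequence


def symbolize(rr: Sequence[float]) -> List[int]:
    """Map an R-R tachogram to a binary sequence of beat-to-beat changes.

    1 -> R-R interval lengthened (heart rate decreased)
    0 -> R-R interval shortened or unchanged (heart rate increased)
    """
    return [1 if nxt - cur > 0 else 0 for cur, nxt in zip(rr, rr[1:])]


# Hypothetical R-R intervals in seconds, for illustration only.
rr_example = [0.80, 0.82, 0.81, 0.81, 0.85, 0.84]
binary = symbolize(rr_example)
```

A tachogram with *K* intervals yields *K* − 1 symbols, and an unchanged interval is coded as 0, following the "≤ 0" convention stated above.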

#### Approximate entropy.

The goal of ApEn is to quantify irregularity or fluctuations in a time series on the basis of Kolmogorov-Sinai entropy (21, 23). It quantifies dynamic aspects of the time series under consideration in a statistical manner. A short description of the formal implementation of ApEn follows (for further details, see Refs. 18 and 22).

Given a time series (e.g., R-R tachogram) with *N* data points *u*(1), *u*(2), … , *u*(*N*), sequences of vectors **x**(1), … , **x**(*N* − *m* + 1) are formed by defining **x**(*i*) = [*u*(*i*), *u*(*i* + 1), … , *u*(*i* + *m* − 1)]. The parameter *m*, the number of components in each vector, has to be fixed. In nonlinear dynamics theory this would be interpreted as an “*m*-dimensional state space reconstruction.” Next the distance d[**x**(*i*),**x**(*j*)] between two vectors **x**(*i*) and **x**(*j*) is defined by the maximum difference of all their scalar components as

$$d[\mathbf{x}(i),\mathbf{x}(j)] = \max_{k=1,\dots,m} \left| u(i+k-1) - u(j+k-1) \right|$$

The “correlation sum” of vector **x**(*i*) is

$$C_i^m(r) = \frac{\text{number of } j \le N-m+1 \text{ such that } d[\mathbf{x}(i),\mathbf{x}(j)] \le r}{N-m+1}$$

The parameter *r* acts like a filter value: within resolution *r*, the numerator counts the number of vectors that are approximately the same as a given reference vector **x**(*i*). The quantity C_{i}^{m}(*r*) is called the correlation sum because it quantifies the summed (or global) correlation of vector **x**(*i*) with all other vectors.

Next, the mean logarithmic correlation sum of all vectors is defined as

$$\Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$$

and ApEn is represented as

$$\mathrm{ApEn}(m, r, N)(u) = \Phi^m(r) - \Phi^{m+1}(r)$$

ApEn(*m, r, N*)(*u*) measures the logarithmic frequency with which vectors with *m* components that are close (within resolution *r*) remain close when the number of vector components is increased by one. This is the key to a measure of irregularity: small values of ApEn indicate regularity, and large values imply substantial fluctuations or irregularity in a time series *u*.
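The definition can be turned into a direct, unoptimized computation. The following Python sketch assumes the standard formulation in which self-matches are counted (so the logarithm is always defined) and the time series has at least *m* + 1 points; the function name `approximate_entropy` is hypothetical.

```python
import math
from typing import Sequence


def approximate_entropy(u: Sequence[float], m: int, r: float) -> float:
    """ApEn(m, r, N)(u): mean log correlation sum for m minus that for m + 1."""
    n = len(u)

    def phi(dim: int) -> float:
        count = n - dim + 1  # number of vectors x(i) with `dim` components
        vectors = [u[i:i + dim] for i in range(count)]
        total = 0.0
        for xi in vectors:
            # Count vectors within resolution r of x(i) under the maximum
            # norm; x(i) itself is included (assumed self-match convention).
            close = sum(
                1 for xj in vectors
                if max(abs(a - b) for a, b in zip(xi, xj)) <= r
            )
            total += math.log(close / count)
        return total / count

    return phi(m) - phi(m + 1)
```

The double loop over vectors makes this quadratic in *N*, which is acceptable for 10-min tachograms but would need a faster neighbor search for much longer series.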

This concept can also be applied to short binary sequences or other symbolic dynamics. To understand the notion of irregularity in binary sequences, consider the sequences 00000, 11111, 01010, and 10110. The first two sequences are easily identified as very regular sequences. In the third sequence, the 0's and 1's alternate, and thus it is suitable to call this sequence regular, too. Only the last sequence does not contain any symmetries or periodically recurring subsequences; in other words, this sequence is more irregular. This concept of irregularity for binary sequences can be quantified using ApEn.

Formally, if ApEn is applied to binary sequences consisting of 1's and 0's, the distance d[**x**(*i*),**x**(*j*)] will be either 0 or 1. Thus it only makes sense to set the resolution *r* < 1. To keep things as easy as possible, we restricted ourselves to *m* = 1. Next, the optimal length of binary sequences to be quantified with ApEn had to be found. As pointed out in Ref. 20, the evaluation of ApEn with *m* = 1 is based on the calculation of the frequencies of the subsequences {0, 1, 00, 01, 10, 11} in the binary sequence under consideration. In a random binary pattern, the longer the binary sequence, the higher the probability that the subsequences occur with almost the same frequency. This would always lead to approximately the same values of ApEn. Thus short binary patterns would be better suited to produce ApEn values that can be distinguished from one another. In this work, we analyzed very short binary sequences (*N* = 5), permitting a good differentiation of the values of ApEn for the distinct binary patterns. We referred to these very short sequences as “binary patterns,” distinguishing them from the 10-min “binary sequences” of heart period dynamics.

To distinguish this use of approximate entropy from the normal use, we called this quantity “binary approximate entropy” (BinApEn). Practically, BinApEn was evaluated for each binary pattern of length *N* = 5 in the whole binary sequence generated from the 10-min R-R tachogram. The average of all BinApEn values was used to quantify heart period irregularity of the binary patterns.
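A minimal Python sketch of BinApEn follows. With *r* < 1, two binary vectors match only if they are identical, so the correlation sums reduce to relative frequencies of single symbols and of adjacent pairs. Two points are assumptions not stated in the text: self-matches are counted, and the patterns are taken as overlapping windows shifted by one symbol. The names `bin_apen` and `mean_bin_apen` are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def bin_apen(pattern: Sequence[int]) -> float:
    """ApEn with m = 1 and r < 1 for a short binary pattern."""
    def phi(dim: int) -> float:
        words = [tuple(pattern[i:i + dim])
                 for i in range(len(pattern) - dim + 1)]
        freq = Counter(words)
        # Mean logarithm of the relative frequency of each word's matches.
        return sum(math.log(freq[w] / len(words)) for w in words) / len(words)

    return phi(1) - phi(2)


def mean_bin_apen(binary: Sequence[int], length: int = 5) -> float:
    """Average BinApEn over all overlapping windows (assumed step of 1)."""
    windows = [binary[i:i + length]
               for i in range(len(binary) - length + 1)]
    return sum(bin_apen(w) for w in windows) / len(windows)


# The four example patterns discussed in the text.
for text in ("00000", "11111", "01010", "10110"):
    print(text, round(bin_apen([int(c) for c in text]), 3))
```

Under these assumptions the two constant patterns give 0, the alternating pattern gives about 0.02, and 10110 gives about 0.37, which matches the ordering of regularity described for these examples.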

#### Shannon entropy.

In contrast to BinApEn, Shannon entropy considers the whole binary sequence generated from the 10-min R-R tachogram. Shannon entropy gives a number that characterizes the probability that different binary patterns of length *N* occur. For a very regular binary sequence, only few distinct patterns occur. Thus Shannon entropy would be small because the probability for these patterns is high and only little information is contained in the whole sequence. For a random binary sequence, all possible patterns of length *N* occur with the same probability, and the content of information is maximal. This case is indicated by maximal values of Shannon entropy.

To formalize this concept, first the probabilities of each pattern of length *N* are estimated from the whole binary sequence (28)

$$\hat{p}(s_1, s_2, \dots, s_N) = \frac{n_{s_1, s_2, \dots, s_N}}{n_{\mathrm{tot}}}$$

where $n_{s_1, s_2, \dots, s_N}$ is the number of occurrences of the pattern *s*_{1}, *s*_{2}, … , *s*_{N} and *n*_{tot} is the total number of patterns. Next, the entropy estimation *S*(*N*) is defined as

$$S(N) = -\frac{1}{N} \sum_{s_1, \dots, s_N} \hat{p}(s_1, \dots, s_N) \, \log_2 \hat{p}(s_1, \dots, s_N)$$

For a better comparison when using different pattern lengths *N*, *S*(*N*) is divided by *N*. Thus the maximal estimation of Shannon entropy is always 1. The properties of this measure are as follows. If only one binary pattern occurs in the whole sequence, *S*(*N*) = 0. If all 2^{N} patterns are equally distributed in the sequence, i.e., the probability is *p̂* = (½)^{N} for all patterns, then *S*(*N*) = 1. This means that all *N* bits are needed to describe the whole binary sequence properly.

In keeping with the pattern length of the BinApEn algorithm, a length (i.e., embedding dimension) of *N* = 5 symbols for the subsequences is used. Keeping in mind that each 10-min interval contains ∼800 heartbeats, this guarantees a proper estimation of the probabilities of all 2^{5} = 32 binary subsequences. Deviations from identical distribution of all binary patterns are observed more easily than for shorter or longer pattern lengths. This entropy estimation is referred to as BinShan.
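The BinShan estimate can be sketched in Python as follows. The sketch assumes that patterns are counted in overlapping windows shifted by one symbol and that the logarithm is taken to base 2, so that dividing by the pattern length bounds the result by 1; the function name `bin_shan` is hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def bin_shan(binary: Sequence[int], length: int = 5) -> float:
    """Normalized Shannon entropy of patterns of the given length (0 to 1)."""
    patterns = [tuple(binary[i:i + length])
                for i in range(len(binary) - length + 1)]
    n_tot = len(patterns)
    counts = Counter(patterns)
    # Sum of p * log2(1 / p) over the patterns that actually occur.
    entropy = sum((n / n_tot) * math.log2(n_tot / n) for n in counts.values())
    return entropy / length
```

For example, a strictly alternating sequence contains only the two patterns 01010 and 10101 with equal frequency, which corresponds to 1 bit of information and hence a normalized value of 1/5.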

#### Surrogate data.

The properties of binary sequences generated from heart period dynamics are still unknown. It is not known whether nonlinear properties can be found in such binary sequences or whether these can be fully described with the help of linear methods. In other words, does the sequence of acceleration and deceleration of heart periods already contain nonlinearities, or is the nonlinear information only revealed if the absolute R-R intervals are taken into account? To answer this question, we used an iterative scheme introduced by Schreiber and Schmitz (27) to produce surrogate data. At the moment, this method seems to be the best choice of all randomization techniques, preserving almost all linear properties of the original time series with relatively low computational costs. In contrast to other techniques, the iterative scheme not only retains the mean and the standard deviation (i.e., the distribution) but also maintains the power spectrum (i.e., the autocorrelations) of the original time series (relative error < 0.1%). All other properties are randomized. Thus the surrogate data cannot be distinguished from original data with any linear measure of HRV.

In this study surrogate data were constructed for each 10-min interval of all 24-h ECGs, and in a second step the binary sequences were generated as described above. If the binary sequences derived from original data contain nonlinear properties, the estimation of BinApEn and BinShan should reveal differences between the original and surrogate data.
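An iterative surrogate scheme of this kind can be sketched with NumPy. The sketch alternates between imposing the original Fourier amplitudes and restoring the original value distribution by rank ordering. It is an illustration only: the fixed iteration count, the random seed, and the function name `iaaft_surrogate` are assumptions, and the original implementation and its stopping criterion may differ.

```python
import numpy as np


def iaaft_surrogate(x, n_iter=100, seed=0):
    """Iterative amplitude-adjusted Fourier transform surrogate (sketch)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    sorted_x = np.sort(x)                # target value distribution
    target_amp = np.abs(np.fft.rfft(x))  # target Fourier amplitudes
    s = rng.permutation(x)               # start from a random shuffle
    for _ in range(n_iter):
        # Step 1: impose the original amplitudes, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_amp * np.exp(1j * phases), n=len(x))
        # Step 2: restore the original distribution by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
    return s
```

Because the loop ends with the rank-ordering step, the surrogate reproduces the original values exactly (in shuffled order), while the power spectrum is matched only approximately. Binary sequences would then be generated from the surrogate tachogram in the same way as from the original data.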

#### Statistics.

Dependencies between two variables were quantified by Pearson's correlation coefficient (referred to as *R* to distinguish it from the parameter *r*). The dependence between mean BinApEn versus mean R-R and Shannon entropy versus mean R-R was quantified by the linear regression *y* = *a* ·*x* + *b*. To test the hypothesis that nonlinear components are still observable in the binary sequences, the distribution of differences between original and surrogate slopes and correlation coefficients was used. The probability of rejecting the null hypothesis that no difference is observable was calculated with Student's *t*-test, and *P* < 0.05 was considered statistically significant.
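The regression and the comparison of original and surrogate results can be sketched in plain Python. The sketch assumes an ordinary least-squares fit and a paired (one-sample) *t* statistic on the differences; the exact test variant is not specified beyond Student's *t*-test, and the function names are hypothetical. Converting the statistic to a *P* value requires the *t* distribution with *n* − 1 degrees of freedom, which is omitted here.

```python
import math
from typing import Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares slope a, intercept b, and Pearson's R for y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx, sxy / math.sqrt(sxx * syy)


def paired_t(original: Sequence[float], surrogate: Sequence[float]) -> float:
    """t statistic for the null hypothesis of zero mean paired difference."""
    d = [o - s for o, s in zip(original, surrogate)]
    n = len(d)
    mean = sum(d) / n
    var = sum((di - mean) ** 2 for di in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)
```

In this setting, `linear_fit` would be applied per recording to the 10-min values (mean R-R interval as *x*, mean BinApEn or BinShan as *y*), and `paired_t` to the per-recording slopes or correlation coefficients of original versus surrogate data.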

## RESULTS

#### Approximate entropy.

The results for BinApEn of all 236 24-h ECGs were examined visually by plotting mean BinApEn against mean R-R interval of each 10-min interval. Figure 2*A* shows an example. A linear dependency between mean BinApEn and mean R-R interval is observable: the longer the R-R interval, the higher the mean BinApEn and, hence, the more irregular the binary patterns. The correlation between mean BinApEn and mean R-R interval yielded *R* = 0.84. Generally, we found this dependence in all 24-h ECGs. In Figs. 3*A* and 4*A* the distributions of slopes and correlation coefficients of all ECGs are shown. The distribution of correlation coefficients has a mean of *R* = 0.78, showing strong correlation between mean BinApEn and mean R-R interval in all ECGs. Thus a proper evaluation of linear regression was guaranteed. The distribution of the slopes yielded a mean slope of *a* = 4.22 × 10^{−1} s^{−1}.

Next, we evaluated BinApEn for the surrogate data in a similar fashion. At first glance, the slope of the linear dependence in Fig. 2*B* is less steep; *a* and *R* are smaller than those of the original data. However, the distribution of correlation coefficients as depicted in Fig. 4*B* shows that the mean coefficient (*R* = 0.73) of the surrogate data is only slightly lower than that of the original data. The distribution of paired differences of correlation coefficients between original and surrogate data has a mean of 0.05 (*P* < 0.0001). Thus surrogate data showed a linear dependence to a slightly lesser extent, but it is still feasible to evaluate linear regression slopes. On the other hand, the distribution of slopes of all surrogate data as shown in Fig. 3*B* revealed a clear reduction of the mean slope (*a* = 2.88 × 10^{−1} s^{−1}). The distribution of paired differences of slopes between the original and surrogate data has its mean at 1.35 × 10^{−1} s^{−1}, showing a clear deviation from zero mean (*P* < 0.0001).

We point out that the evaluation of the linear regression depends on the correlation between mean BinApEn and mean R-R interval. Consequently, the decrease of the slope of the linear regression for the surrogate data is partly due to a decrease in the correlation between mean BinApEn and mean R-R interval.

#### Shannon entropy.

An example of BinShan of 10-min intervals plotted against mean R-R interval is depicted in Fig. 5*A* (data are from same subject as shown in Fig. 2). Overall, in all ECGs, as mean R-R interval increased, BinShan decreased. This implies that a shorter mean R-R interval could be associated with more equally distributed binary patterns. The distribution of slopes yielded a mean of *a* = −2.32 × 10^{−1} s^{−1} (Fig. 6*A*). The mean value of *R* (Fig. 7*A*, *R* = −0.56) guaranteed a proper evaluation of linear regression.

For the surrogate data, values of BinShan are generally increased as shown in Fig. 5*B*. Thus a less marked difference between short and long mean R-R interval was observable, and hence, *R* is reduced (Fig. 7*B*, mean *R* = −0.42). The distribution of slopes was shifted to higher values (Fig. 6*B*, mean *a* = −1.04 × 10^{−1} s^{−1}). The distribution of paired differences of slopes showed a clear deviation from zero mean (mean *a* = −1.28 × 10^{−1} s^{−1}, *P* < 0.0001).

#### Reproducibility of BinApEn and BinShan.

Two consecutive 24-h ECGs were available for each subject. The slopes of linear regression of each subject were used to estimate the reproducibility. The slopes of *ECG A* were plotted against those of *ECG B* (Fig. 8). Both entropies yielded strong correlation between the slopes of both days (BinApEn: *R* = 0.78; BinShan: *R* = 0.85). This implies a good intraindividual reproducibility of BinApEn and BinShan. Because the slopes showed a broad distribution, this result may also imply that each subject has its specific slope of linear regression.

## DISCUSSION

We used binary sequences derived from R-R tachograms of 24-h ECG recordings that retain only basic dynamic aspects of the R-R tachogram, i.e., the acceleration (set to 0) and deceleration (set to 1) of heartbeat, to estimate approximate and Shannon entropy. This kind of dynamic symbolization allowed the examination of stationary as well as many nonstationary segments because the symbolization of differences between R-R intervals eliminates nonstationarities resulting from a minor bias underlying the R-R tachogram. We did not calculate entropy estimations using a static symbolization (e.g., all R-R intervals above the level of the mean R-R interval were set to 1, and the others were set to 0). In the literature this kind of transformation is used to detect so-called “forbidden words,” i.e., patterns in successive R-R intervals, that might be of interest in certain cardiac diseases (11, 32-34). In the context of entropy estimations established in this study, the latter transformation is not useful because it often yields long chains of 1's or 0's in nonstationary sequences, resulting in minimal entropy estimations for BinApEn and BinShan that might be interpreted spuriously.

The evaluation of mean BinApEn of each 10-min interval exhibited two properties: mean BinApEn strongly correlated with mean R-R interval and was very reproducible for each subject. Mean BinApEn demonstrated that short binary patterns were most regular at short R-R intervals and displayed more irregularity with increasing R-R intervals. BinShan was maximal for shorter R-R intervals, indicating that all binary patterns occur with almost the same probability, and was minimal for longer R-R intervals, exhibiting predominance of certain binary patterns that may result from phase locking with the respiratory rhythm (see below).

We point out that BinApEn and BinShan deal with two different notions of regularity. BinApEn quantifies the regularity of short binary patterns, whereas BinShan quantifies the regularity of the occurrence of the binary patterns. Thus the two notions complement each other.

Considering only Shannon entropy would lead to the conclusion that the general behavior of heart period dynamics seems to be more regular at longer R-R intervals in the sense that certain binary patterns predominantly occur, whereas other patterns tend to disappear. On the other hand, the results of BinApEn indicate that for long R-R intervals the binary patterns in heart period dynamics were those with highest irregularity. Combining these findings, we can conclude that although fewer distinct patterns occurred at longer R-R intervals, these patterns were precisely those reflecting greater irregularity. In other words, at longer R-R intervals irregular patterns of heart period dynamics appeared more regularly.

Although we did not differentiate between daytime and nighttime (or sleep stages), we noted that long R-R intervals are likely to appear at night, whereas short ones appear during the day. This is shown in Fig. 2, in which two distinct regions are separated at a mean R-R interval of ∼0.85 s. This leads to the conclusion that at night, fewer distinct dynamic patterns of the R-R intervals occur more regularly, but the dynamics of these patterns are more irregular than during the day.

This finding fills the gap between the findings of two former studies conducted in our laboratory. Using the full information of R-R interval lengths for the evaluation of ApEn, we were able to demonstrate that heart period dynamics are more irregular at night than during the day and that the change from day to night or vice versa is probably accompanied by a phase transition in the notion of synergetics (i.e., no linear dependence on mean R-R interval length) (2). In a recent study, we emphasized that at night, cardiac dynamics reveal a predominance of binary patterns that can be assigned to distinct frequency ratios or even phase locking with the respiratory rhythm (e.g., 4:1, 7:2, 5:1) (1). For example, if 5:1 phase locking is present, the binary pattern 11001 must occur predominantly and cyclically recurrent. This predominance was interpreted as an increase of heart period regularity and an augmentation of musical rhythmicity in cardiac dynamics. In the present analysis, this pattern was identified as one of the most irregular patterns, i.e., with the highest value of BinApEn (20), leading to high values of mean BinApEn. Thus the predominance of binary patterns that results from frequency or phase locking ratios may still lead to strong irregularities within the binary patterns. We point out that synchronization in physiological systems is most often an intermittent phenomenon, detectable during short periods of time with changing locking ratios (26, 29). A further distinction of irregularities between synchronized and nonsynchronized sequences has yet to be established.
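The statement that 11001 is among the most irregular patterns can be checked by enumerating all 32 binary patterns of length 5. The Python sketch below assumes ApEn with *m* = 1, *r* < 1, and counted self-matches; the function name `bin_apen` is hypothetical.

```python
import math
from collections import Counter
from itertools import product


def bin_apen(pattern: str) -> float:
    """ApEn with m = 1 and r < 1 for a binary pattern given as a string."""
    def phi(dim: int) -> float:
        words = [pattern[i:i + dim] for i in range(len(pattern) - dim + 1)]
        freq = Counter(words)
        return sum(math.log(freq[w] / len(words)) for w in words) / len(words)

    return phi(1) - phi(2)


# Score every pattern of length 5 and collect those sharing the maximum.
patterns = ["".join(bits) for bits in product("01", repeat=5)]
scores = {p: bin_apen(p) for p in patterns}
top = max(scores.values())
most_irregular = sorted(p for p, v in scores.items() if math.isclose(v, top))
print(most_irregular)
```

Under these assumptions the maximum of about 0.71 is shared by four patterns (00110, 01100, 10011, and 11001), namely those in which each of the pairs 00, 01, 10, and 11 occurs exactly once.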

The use of surrogate data resulted in a reduction of the slopes of the linear regression between mean BinApEn and mean R-R intervals. For short R-R intervals mean BinApEn slightly increased, and for long R-R intervals mean BinApEn slightly decreased. The values of mean BinApEn of binary sequences generated from completely random sequences (independent identical distribution) tend toward a value of ∼0.37. (Note that by construction, purely random sequences are not maximally irregular in the sense of BinApEn; see, e.g., Ref. 19.) This indicates that the randomization procedure destroyed some inherent nonlinear properties because the values of mean BinApEn tended toward the stated value even though almost all linear properties were kept constant. The results for BinShan of the surrogate data can be interpreted in a similar fashion. In conclusion, the dynamic properties under consideration cannot solely be described with linear methods but also show evidence of nonlinearities. Moreover, even binary sequences contain nonlinear properties that cannot be described with measures of HRV derived from linear time series analysis.

By focusing on the beat-to-beat acceleration and deceleration of heart periods, only fast-modulating rhythms in heart period dynamics are captured, i.e., changes in heart periods due to respiratory sinus arrhythmia (RSA) and other parasympathetic activity. The effects of slower rhythms that influence the heart periods, e.g., the blood pressure or slower variations, can be neglected because they only give rise to a bias underlying the fastest modulation. These slower modulations affect the symbolization scheme only if the bias exceeds the modulations of the RSA. Hence, our results are primarily attributed to the vagal activity on the cardiac system. It is well known that the vagal influence shows a circadian pattern with an increasing strength at night (4). This is in accordance with the aforementioned binary pattern types that occur predominantly at longer R-R intervals; these patterns may indicate frequency or phase locking between heartbeat and respiration and, at the least, reveal certain frequency ratios between these two interacting systems. Keeping our results in mind, the interpretation of an HRV power spectrum can be extended. On one hand, a pronounced modulation of heart periods by RSA causes high power in the respiratory frequency band. This implies that the heart periods are modulated more regularly. On the other hand, the same modulation may result in more irregular patterns of heart period dynamics, contributing to an increase of complexity.

Moreover, the entropies of binary heart period dynamics turned out to be highly reproducible for each subject. This fact supports the findings that each healthy individual maintains the dynamic properties of the heart periods over at least two days (1). Further investigations may show how these properties depend on age and are affected by cardiovascular and autonomic diseases.

In conclusion, the findings of this study have demonstrated that the binary symbolization of R-R interval dynamics, which at first glance seems to be an enormous waste of information, gives an important key to a better understanding of normal heart period regularity. Furthermore, differential binary symbolization still enables the identification of nonlinear dynamical properties.

## Acknowledgments

We acknowledge financial support from Weleda, Schwäbisch Gmünd, Germany (to H. Bettermann and D. Cysarz).

## Footnotes

Address for reprint requests and other correspondence: D. Cysarz, Dept. of Clinical Research, Gemeinschaftskrankenhaus Herdecke, Gerhard-Kienle-Weg 4, D-58313 Herdecke, Germany (E-mail: d.cysarz{at}rhythmen.de).

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “*advertisement*” in accordance with 18 U.S.C. §1734 solely to indicate this fact.

- Copyright © 2000 the American Physiological Society