## Abstract

Background

Heart rate variability is the variation of the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, towards the goal of enabling further biomarker discovery.

Methods

This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access online archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of the threshold value r (chosen as a multiple of the sample standard deviation σ or as the entropy-maximizing r_Chon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds r_F and r_L for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Depending on the outcome, the p-value was calculated with either a two-sample t-test or a Wilcoxon rank-sum test.
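To illustrate how the threshold r and the data length N enter such a computation, the following is a minimal sketch of sample entropy with r given as a multiple of the sample standard deviation σ. The function name `sample_entropy`, the default embedding dimension m = 2, and the match criterion (Chebyshev distance ≤ r) are assumptions made for illustration; they are not taken from the study, and published implementations differ in such details.

```python
import math
import statistics


def sample_entropy(x, m=2, r_factor=0.2):
    """Sketch of sample entropy with threshold r = r_factor * sigma.

    x        : sequence of floats (e.g. RR intervals)
    m        : embedding dimension (assumed default, not from the study)
    r_factor : multiple of the sample standard deviation used as threshold
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")

    r = r_factor * statistics.stdev(x)  # sample standard deviation

    def count_matches(length):
        # Count template pairs (i < j, no self-matches) whose Chebyshev
        # distance is within r. The same n - m start indices are used
        # for both template lengths so the two counts are comparable.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1

    if a == 0 or b == 0:
        # Too few matches: the estimate is undefined. This happens more
        # easily for short series or small thresholds.
        return math.inf
    return -math.log(a / b)
```

Calling `sample_entropy(rr, r_factor=0.2)` on a list of RR intervals `rr` (a hypothetical variable) corresponds to r = 0.2σ. The double loop is quadratic in N, which is acceptable for a sketch but slow for long recordings.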

Results

The first test shows a cross-over of entropy values as r changes. Thus, higher entropy cannot be read unambiguously as higher irregularity; entropy is rather an indicator of differences in regularity. N should be at least 200 data points for r = 0.2σ and should even exceed 1000 for r = r_Chon. The results for the weighting parameters n of the fuzzy membership function show different behavior when coupled with different r values; therefore, the weighting parameters were chosen independently for each threshold value. The tests concerning r_F and r_L showed that there is no optimal choice, but r = r_F = r_L is reasonable with r = r_Chon or r = 0.2σ.
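The coupling between n and r can be made concrete with a small sketch. It assumes an exponential membership function of the form exp(-(d/r)^n), a form commonly used for fuzzy entropies; the abstract does not state which form the study used, and the names `fuzzy_membership` and `hard_match` are hypothetical.

```python
import math


def fuzzy_membership(d, r, n):
    """Assumed exponential membership: similarity in [0, 1] for distance d.

    r sets the distance at which similarity drops to exp(-1);
    n controls how sharp the transition around r is.
    """
    if r <= 0 or n <= 0:
        raise ValueError("r and n must be positive")
    return math.exp(-((d / r) ** n))


def hard_match(d, r):
    # Hard threshold as used in approximate and sample entropy.
    return 1.0 if d <= r else 0.0
```

Under this assumed form, a large n makes `fuzzy_membership` approach `hard_match`, while a small n spreads the transition over a range of distances that scales with r. This is one way to see why a value of n that works for one threshold need not suit another.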

Conclusions

Some of the tests showed a dependency of the test significance on the data at hand. Nevertheless, as the medical conditions are unknown beforehand, compromises had to be made. Optimal parameter combinations are suggested for the methods considered. Yet, due to the high number of potential parameter combinations, further investigations of entropy for heart rate variability data will be necessary.

| Original language | English |
|---|---|
| Pages (from-to) | 1-11 |
| Journal | BMC Bioinformatics |
| Volume | 15 |
| Issue number | S2 |
| DOIs | |
| Publication status | Published - 2014 |

## Keywords

- Entropy-based data mining
- Knowledge Discovery
- Health Informatics
- Parameter selection
- Entropy

## ASJC Scopus subject areas

- Information Systems
- Statistics, Probability and Uncertainty

## Fields of Expertise

- Information, Communication & Computing

## Treatment code (detailed classification)

- Basic - Fundamental (basic research)