### Abstract

Background

Heart rate variability is the variation in the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, with the goal of enabling further biomarker discovery.

Methods

This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access, online archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of the threshold value r (as multiples of the sample standard deviation σ, or the entropy-maximizing r_Chon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds r_F and r_L for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Depending on the outcome, the p-value was calculated with either a two-sample t-test or a Wilcoxon rank-sum test.
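To make the role of the threshold r and the data length N concrete, the following is a minimal sketch of sample entropy in plain Python. It is not the study's implementation: the embedding dimension `m = 2`, the use of `<=` when comparing distances to r, and the names `sample_entropy` and `r_factor` are assumptions made for illustration. Only the choice r = 0.2σ comes from the text.

```python
import math
import statistics


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a sequence x (illustrative sketch).

    m        -- embedding dimension (assumed default, not from the study)
    r_factor -- threshold as a multiple of the sample standard deviation
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    r = r_factor * statistics.stdev(x)  # threshold r = r_factor * sigma

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= r.
        # The same n - m start positions are used for both template
        # lengths so that the two counts are comparable.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                dist = max(abs(x[i + k] - x[j + k]) for k in range(length))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return math.inf       # ratio undefined: no matches found
    return -math.log(a / b)


if __name__ == "__main__":
    rr = [0.80, 0.82, 0.79, 0.81, 0.83, 0.80, 0.78, 0.82, 0.81, 0.80] * 20
    print(sample_entropy(rr))
```

The double loop is quadratic in N, which is acceptable for series of a few hundred to a few thousand beats. With short series or small r, few or no template matches are found and the estimate becomes unreliable or undefined, which is one reason a minimum data length matters.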

Results

The first test shows a cross-over of entropy values as r changes. Thus, a higher entropy cannot be clearly stated to correspond to a higher irregularity; rather, entropy is an indicator of differences in regularity. N should be at least 200 data points for r = 0.2σ and should even exceed 1000 for r = r_Chon. The results for the weighting parameters n of the fuzzy membership function show different behavior when coupled with different r values; therefore, the weighting parameters were chosen independently for the different threshold values. The tests concerning r_F and r_L showed that there is no optimal choice, but r = r_F = r_L is reasonable with r = r_Chon or r = 0.2σ.
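The interplay between the weighting factor n and the threshold r can be illustrated with an exponential membership function of the form exp(-(d/r)^n), a shape commonly used in fuzzy entropy. The abstract does not state the exact function used in the study, so this form and the name `fuzzy_membership` are assumptions for illustration only.

```python
import math


def fuzzy_membership(d, r, n):
    """Degree of similarity in [0, 1] for a distance d >= 0.

    Assumed form: exp(-(d / r) ** n). r sets the width of the
    function and n sets how sharp the transition around d = r is.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    if d < 0:
        raise ValueError("d must be non-negative")
    return math.exp(-((d / r) ** n))


if __name__ == "__main__":
    r = 0.2
    for n in (1, 2, 3):
        values = [round(fuzzy_membership(d, r, n), 3) for d in (0.1, 0.2, 0.4)]
        print(n, values)
```

Under this assumed form, every n gives the same value exp(-1) at d = r. Below r, a larger n pushes the membership toward 1, and above r it pushes it toward 0, so the effect of n is tied to where the typical distances lie relative to r.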

Conclusions

Some of the tests showed a dependency of the test significance on the data at hand. Nevertheless, as the medical conditions are unknown beforehand, compromises had to be made. Optimal parameter combinations are suggested for the methods considered. Yet, due to the high number of potential parameter combinations, further investigations of entropy for heart rate variability data will be necessary.

Language | English |
---|---|
Pages | 1-11 |
Journal | BMC Bioinformatics |
Volume | 15 |
Issue number | S2 |
DOIs | 10.1186/1471-2105-15-S6-S2 |
Status | Published - 2014 |

### Keywords

- Entropy-based data mining
- Knowledge Discovery
- Health Informatics
- Parameter selection
- entropy

### ASJC Scopus subject areas

- Information Systems
- Statistics, Probability and Uncertainty

### Fields of Expertise

- Information, Communication & Computing

### Treatment code (Nähere Zuordnung)

- Basic - Fundamental (Grundlagenforschung)

### Cite this

**Selection of entropy-measure parameters for knowledge discovery in heart rate variability data.** / Mayer, Christopher; Bachler, Martin; Hörtenhuber, Matthias; Stocker, Christof; Holzinger, Andreas; Wassertheurer, Sigi.

Research output: Contribution to journal › Article › Research › peer-review

*BMC Bioinformatics*, vol. 15, no. S2, pp. 1-11. DOI: 10.1186/1471-2105-15-S6-S2

TY - JOUR

T1 - Selection of entropy-measure parameters for knowledge discovery in heart rate variability data

AU - Mayer,Christopher

AU - Bachler,Martin

AU - Hörtenhuber,Matthias

AU - Stocker,Christof

AU - Holzinger,Andreas

AU - Wassertheurer,Sigi

PY - 2014

Y1 - 2014

AB - Background: Heart rate variability is the variation in the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, with the goal of enabling further biomarker discovery. Methods: This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access, online archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of the threshold value r (as multiples of the sample standard deviation σ, or the entropy-maximizing r_Chon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds r_F and r_L for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Depending on the outcome, the p-value was calculated with either a two-sample t-test or a Wilcoxon rank-sum test. Results: The first test shows a cross-over of entropy values as r changes. Thus, a higher entropy cannot be clearly stated to correspond to a higher irregularity; rather, entropy is an indicator of differences in regularity. N should be at least 200 data points for r = 0.2σ and should even exceed 1000 for r = r_Chon. The results for the weighting parameters n of the fuzzy membership function show different behavior when coupled with different r values; therefore, the weighting parameters were chosen independently for the different threshold values. The tests concerning r_F and r_L showed that there is no optimal choice, but r = r_F = r_L is reasonable with r = r_Chon or r = 0.2σ. Conclusions: Some of the tests showed a dependency of the test significance on the data at hand. Nevertheless, as the medical conditions are unknown beforehand, compromises had to be made. Optimal parameter combinations are suggested for the methods considered. Yet, due to the high number of potential parameter combinations, further investigations of entropy for heart rate variability data will be necessary.

KW - Entropy-based data mining

KW - Knowledge Discovery

KW - Health Informatics

KW - Parameter selection

KW - entropy

U2 - 10.1186/1471-2105-15-S6-S2

DO - 10.1186/1471-2105-15-S6-S2

M3 - Article

VL - 15

SP - 1

EP - 11

JO - BMC Bioinformatics

T2 - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - S2

ER -