Selection of entropy-measure parameters for knowledge discovery in heart rate variability data

Christopher Mayer, Martin Bachler, Matthias Hörtenhuber, Christof Stocker, Andreas Holzinger, Sigi Wassertheurer

Research output: Contribution to journal › Article

Abstract

Background

Heart rate variability is the variation of the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, towards the goal of enabling further biomarker discovery.

Methods

This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access, on-line archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of: varying the threshold value r (as multiples of the sample standard deviation σ, or the entropy-maximizing rChon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds rF and rL for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Depending on the outcome, the p-value was calculated with either a two-sample t-test or a Wilcoxon rank sum test.
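
To make this procedure concrete, the sketch below shows a straightforward sample entropy computation with the threshold expressed as a multiple of the sample standard deviation, followed by the normality-based choice between a two-sample t-test and a Wilcoxon rank sum test. This is a minimal illustration assuming an embedding dimension m = 2 and standard SciPy/statsmodels routines; the function names and defaults are assumptions for illustration, not taken from the paper.

```python
# Illustrative sketch only (not the authors' implementation).
import numpy as np
from scipy.stats import ttest_ind, ranksums
from statsmodels.stats.diagnostic import lilliefors


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D series with threshold r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)

    def count_matches(dim):
        # All overlapping templates of length `dim`.
        templates = np.array([x[i:i + dim] for i in range(n - dim)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance to all later templates (self-matches excluded).
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(dist <= r)
        return count

    b = count_matches(m)       # template matches of length m
    a = count_matches(m + 1)   # template matches of length m + 1
    return -np.log(a / b) if a > 0 and b > 0 else np.inf


def compare_groups(group1, group2, alpha=0.05):
    """Choose the significance test based on Lilliefors' normality result."""
    normal = all(lilliefors(np.asarray(g))[1] > alpha for g in (group1, group2))
    if normal:
        return ttest_ind(group1, group2).pvalue
    return ranksums(group1, group2).pvalue
```

Approximate, fuzzy, and fuzzy measure entropies follow the same template-matching idea; the fuzzy variants replace the hard threshold comparison with graded membership weights controlled by the weighting factor n.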

Results

The first test shows a cross-over of entropy values with regard to a change of r. Thus, a higher entropy value cannot simply be equated with higher irregularity; it rather indicates differences in regularity. N should be at least 200 data points for r = 0.2σ and should exceed 1000 data points for r = rChon. The weighting parameter n of the fuzzy membership function behaves differently when coupled with different r values; therefore, the weighting parameters were chosen separately for each threshold value. The tests concerning rF and rL showed that there is no single optimal choice, but r = rF = rL is a reasonable choice with r = rChon or r = 0.2σ.
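
As a small usage sketch, the helper below packages the parameter recommendations from this section (r = 0.2σ with at least 200 samples, more than 1000 samples when the entropy-maximizing rChon is used, and rF = rL = r for fuzzy measure entropy). The function and field names are hypothetical, and the computation of rChon itself is not shown.

```python
# Hypothetical helper reflecting the recommendations above; names are illustrative.
import numpy as np


def suggested_entropy_params(x, use_chon=False, r_chon=None):
    """Return threshold settings following the recommendations in the Results."""
    x = np.asarray(x, dtype=float)
    # At least 200 points are suggested for r = 0.2*sigma, over 1000 for r = rChon.
    min_len = 1000 if use_chon else 200
    if len(x) < min_len:
        raise ValueError(f"series too short: at least {min_len} samples suggested")
    r = r_chon if use_chon else 0.2 * np.std(x)
    # r_F = r_L = r is reported as a reasonable choice for fuzzy measure entropy.
    return {"r": r, "r_F": r, "r_L": r}
```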

Conclusions

Some of the tests showed that statistical significance depends on the data at hand. Nevertheless, as the medical conditions are unknown beforehand, compromises had to be made. Optimal parameter combinations are suggested for the methods considered. Yet, due to the high number of potential parameter combinations, further investigation of entropy measures for heart rate variability data will be necessary.
Language: English
Pages: 1-11
Journal: BMC Bioinformatics
Volume: 15
Issue number: S2
DOI: 10.1186/1471-2105-15-S6-S2
Status: Published - 2014


Keywords

  • Entropy-based data mining
  • Knowledge Discovery
  • Health Informatics
  • Parameter selection
  • entropy

ASJC Scopus subject areas

  • Information Systems
  • Statistics, Probability and Uncertainty

Fields of Expertise

  • Information, Communication & Computing

Treatment code (Nähere Zuordnung)

  • Basic - Fundamental (Grundlagenforschung)

Cite this

Mayer, C., Bachler, M., Hörtenhuber, M., Stocker, C., Holzinger, A., & Wassertheurer, S. (2014). Selection of entropy-measure parameters for knowledge discovery in heart rate variability data. BMC Bioinformatics, 15(S2), 1-11. DOI: 10.1186/1471-2105-15-S6-S2

Selection of entropy-measure parameters for knowledge discovery in heart rate variability data. / Mayer, Christopher; Bachler, Martin; Hörtenhuber, Matthias; Stocker, Christof; Holzinger, Andreas; Wassertheurer, Sigi.

In: BMC Bioinformatics, Vol. 15, No. S2, 2014, p. 1-11.

Research output: Contribution to journal › Article

Mayer, C, Bachler, M, Hörtenhuber, M, Stocker, C, Holzinger, A & Wassertheurer, S 2014, 'Selection of entropy-measure parameters for knowledge discovery in heart rate variability data', BMC Bioinformatics, vol. 15, no. S2, pp. 1-11. DOI: 10.1186/1471-2105-15-S6-S2
Mayer C, Bachler M, Hörtenhuber M, Stocker C, Holzinger A, Wassertheurer S. Selection of entropy-measure parameters for knowledge discovery in heart rate variability data. BMC Bioinformatics. 2014;15(S2):1-11. DOI: 10.1186/1471-2105-15-S6-S2
Mayer, Christopher; Bachler, Martin; Hörtenhuber, Matthias; Stocker, Christof; Holzinger, Andreas; Wassertheurer, Sigi. / Selection of entropy-measure parameters for knowledge discovery in heart rate variability data. In: BMC Bioinformatics. 2014; Vol. 15, No. S2. pp. 1-11.
@article{b00f2f5ea17041a4b302c1b163552e35,
title = "Selection of entropy-measure parameters for knowledge discovery in heart rate variability data",
abstract = "Background: Heart rate variability is the variation of the time interval between consecutive heartbeats. Entropy is a commonly used tool to describe the regularity of data sets. Entropy functions are defined using multiple parameters, the selection of which is controversial and depends on the intended purpose. This study describes the results of tests conducted to support parameter selection, towards the goal of enabling further biomarker discovery. Methods: This study deals with approximate, sample, fuzzy, and fuzzy measure entropies. All data were obtained from PhysioNet, a free-access, on-line archive of physiological signals, and represent various medical conditions. Five tests were defined and conducted to examine the influence of: varying the threshold value r (as multiples of the sample standard deviation σ, or the entropy-maximizing rChon), the data length N, the weighting factors n for fuzzy and fuzzy measure entropies, and the thresholds rF and rL for fuzzy measure entropy. The results were tested for normality using Lilliefors' composite goodness-of-fit test. Depending on the outcome, the p-value was calculated with either a two-sample t-test or a Wilcoxon rank sum test. Results: The first test shows a cross-over of entropy values with regard to a change of r. Thus, a higher entropy value cannot simply be equated with higher irregularity; it rather indicates differences in regularity. N should be at least 200 data points for r = 0.2σ and should exceed 1000 data points for r = rChon. The weighting parameter n of the fuzzy membership function behaves differently when coupled with different r values; therefore, the weighting parameters were chosen separately for each threshold value. The tests concerning rF and rL showed that there is no single optimal choice, but r = rF = rL is a reasonable choice with r = rChon or r = 0.2σ. Conclusions: Some of the tests showed that statistical significance depends on the data at hand. Nevertheless, as the medical conditions are unknown beforehand, compromises had to be made. Optimal parameter combinations are suggested for the methods considered. Yet, due to the high number of potential parameter combinations, further investigation of entropy measures for heart rate variability data will be necessary.",
keywords = "Entropy-based data mining, Knowledge Discovery, Health Informatics, Parameter selection, entropy",
author = "Christopher Mayer and Martin Bachler and Matthias H{\"o}rtenhuber and Christof Stocker and Andreas Holzinger and Sigi Wassertheurer",
year = "2014",
doi = "10.1186/1471-2105-15-S6-S2",
language = "English",
volume = "15",
pages = "1--11",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "S2",

}
