Use case driven evaluation of open databases for pediatric cancer research

Research output: Contribution to journal › Article › Research › peer-review

Abstract


Background

A plethora of Web resources are available offering information on clinical, pre-clinical, genomic and theoretical aspects of cancer, including not only comprehensive cancer projects such as ICGC and TCGA, but also lesser-known and more specialized projects on pediatric diseases such as PCGP. However, very little data on childhood cancer is openly available. Several web-based resources and tools offer general biomedical data but are purpose-built for neither pediatric nor cancer analysis. Additionally, many Web resources on cancer focus on incidence data and statistical social characteristics as well as self-regulating communities.

Methods

We summarize those resources which are open and are considered to support fundamental scientific research, and we focus our comparison on 11 identified pediatric cancer-specific resources (5 tools, 6 databases). The evaluation consists of 5 use cases based on the example of brain tumor research and covers user-defined search scenarios as well as data mining tasks, also examining interactive visual analysis features.

Results

Web resources differ in terms of information quantity and presentation. Pedican lists an abundance of entries with few selection features. PeCan and PedcBioPortal include visual analysis tools, while the latter integrates published and new consortia-based data. UCSC Xena Browser offers an in-depth analysis of genomic data. The ICGC data portal provides various features for data analysis and an option to submit one's own data. Its focus lies on adult Pan-Cancer projects. Pediatric Pan-Cancer datasets are being integrated into PeCan and PedcBioPortal. Comparing information on prominent mutations within glioma reveals well-known, unknown, possible, as well as inapplicable biomarkers. This summary further emphasizes the varying data allocation. The tested tools show advantages and disadvantages, depending on the respective use case scenario, providing inhomogeneous data quantity and information specifics.

Conclusions

Web resources on specific pediatric cancers are less abundant and less well known than those offering adult cancer research data. Meanwhile, current efforts of ongoing pediatric data collection and Pan-Cancer projects indicate future opportunities for childhood cancer research, which is greatly needed for both fundamental and clinical research.
Language: English
Article number: 2
Pages: 2
Number of pages: 20
Journal: BioData Mining
Volume: 12
Issue number: 1
DOI: 10.1186/s13040-018-0190-8
Status: Published - 15 Jan 2019

Fingerprint

Pediatrics
Use Case
Cancer
Databases
Evaluation
Research
Neoplasms
Resources
Carya
Biomarkers
Genomics
Data mining
Feature extraction
Tumors
Brain
Data Allocation
Brain Tumor
Scenarios
Data Mining
Brain Neoplasms

Keywords

  • In silico analysis
  • Brain tumor
  • Cancer database
  • Glioma
  • Open research
  • Pediatric oncology
  • Childhood cancer

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Mathematics
  • Genetics
  • Molecular Biology
  • Biochemistry
  • Computer Science Applications
  • Computational Theory and Mathematics

Fields of Expertise

  • Information, Communication & Computing

Treatment code (Nähere Zuordnung)

  • Basic - Fundamental (Grundlagenforschung)

Cite this

Use case driven evaluation of open databases for pediatric cancer research. / Jeanquartier, Fleur; Jeanquartier, Claire; Holzinger, Andreas.

In: BioData Mining, Vol. 12, No. 1, 2, 15.01.2019, p. 2.

@article{2f57e625d6614ee0bb675f38ac4636a2,
title = "Use case driven evaluation of open databases for pediatric cancer research",
abstract = "Background: A plethora of Web resources are available offering information on clinical, pre-clinical, genomic and theoretical aspects of cancer, including not only the comprehensive cancer projects as ICGC and TCGA, but also less-known and more specialized projects on pediatric diseases such as PCGP. However, in case of data on childhood cancer there is very little information openly available. Several web-based resources and tools offer general biomedical data which are not purpose-built, for neither pediatric nor cancer analysis. Additionally, many Web resources on cancer focus on incidence data and statistical social characteristics as well as self-regulating communities. Methods: We summarize those resources which are open and are considered to support scientific fundamental research, while we address our comparison to 11 identified pediatric cancer-specific resources (5 tools, 6 databases). The evaluation consists of 5 use cases on the example of brain tumor research and covers user-defined search scenarios as well as data mining tasks, also examining interactive visual analysis features. Results: Web resources differ in terms of information quantity and presentation. Pedican lists an abundance of entries with few selection features. PeCan and PedcBioPortal include visual analysis tools while the latter integrates published and new consortia-based data. UCSC Xena Browser offers an in-depth analysis of genomic data. ICGC data portal provides various features for data analysis and an option to submit own data. Its focus lies on adult Pan-Cancer projects. Pediatric Pan-Cancer datasets are being integrated into PeCan and PedcBioPortal. Comparing information on prominent mutations within glioma discloses well-known, unknown, possible, as well as inapplicable biomarkers. This summary further emphasizes the varying data allocation. Tested tools show advantages and disadvantages, depending on the respective use case scenario, providing inhomogeneous data quantity and information specifics. Conclusions: Web resources on specific pediatric cancers are less abundant and less-known compared to those offering adult cancer research data. Meanwhile, current efforts of ongoing pediatric data collection and Pan-Cancer projects indicate future opportunities for childhood cancer research, that is greatly needed for both fundamental as well as clinical research.",
keywords = "In silico analysis, brain tumor, cancer database, Brain tumor, Cancer database, Glioma, Open research, Pediatric oncology, Childhood cancer",
author = "Fleur Jeanquartier and Claire Jeanquartier and Andreas Holzinger",
year = "2019",
month = "1",
day = "15",
doi = "10.1186/s13040-018-0190-8",
language = "English",
volume = "12",
pages = "2",
journal = "BioData Mining",
issn = "1756-0381",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Use case driven evaluation of open databases for pediatric cancer research

AU - Jeanquartier, Fleur

AU - Jeanquartier, Claire

AU - Holzinger, Andreas

PY - 2019/1/15

Y1 - 2019/1/15

AB - Background: A plethora of Web resources are available offering information on clinical, pre-clinical, genomic and theoretical aspects of cancer, including not only the comprehensive cancer projects as ICGC and TCGA, but also less-known and more specialized projects on pediatric diseases such as PCGP. However, in case of data on childhood cancer there is very little information openly available. Several web-based resources and tools offer general biomedical data which are not purpose-built, for neither pediatric nor cancer analysis. Additionally, many Web resources on cancer focus on incidence data and statistical social characteristics as well as self-regulating communities. Methods: We summarize those resources which are open and are considered to support scientific fundamental research, while we address our comparison to 11 identified pediatric cancer-specific resources (5 tools, 6 databases). The evaluation consists of 5 use cases on the example of brain tumor research and covers user-defined search scenarios as well as data mining tasks, also examining interactive visual analysis features. Results: Web resources differ in terms of information quantity and presentation. Pedican lists an abundance of entries with few selection features. PeCan and PedcBioPortal include visual analysis tools while the latter integrates published and new consortia-based data. UCSC Xena Browser offers an in-depth analysis of genomic data. ICGC data portal provides various features for data analysis and an option to submit own data. Its focus lies on adult Pan-Cancer projects. Pediatric Pan-Cancer datasets are being integrated into PeCan and PedcBioPortal. Comparing information on prominent mutations within glioma discloses well-known, unknown, possible, as well as inapplicable biomarkers. This summary further emphasizes the varying data allocation. Tested tools show advantages and disadvantages, depending on the respective use case scenario, providing inhomogeneous data quantity and information specifics. Conclusions: Web resources on specific pediatric cancers are less abundant and less-known compared to those offering adult cancer research data. Meanwhile, current efforts of ongoing pediatric data collection and Pan-Cancer projects indicate future opportunities for childhood cancer research, that is greatly needed for both fundamental as well as clinical research.

KW - In silico analysis

KW - brain tumor

KW - cancer database

KW - Brain tumor

KW - Cancer database

KW - Glioma

KW - Open research

KW - Pediatric oncology

KW - Childhood cancer

UR - http://www.scopus.com/inward/record.url?scp=85059987639&partnerID=8YFLogxK

U2 - 10.1186/s13040-018-0190-8

DO - 10.1186/s13040-018-0190-8

M3 - Article

VL - 12

SP - 2

JO - BioData Mining

T2 - BioData Mining

JF - BioData Mining

SN - 1756-0381

IS - 1

M1 - 2

ER -