Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function

Minoska Valli, Nadine E Tatto, Armin Peymann, Clemens Gruber, Nils Landes, Heinz Ekker, Gerhard G Thallinger, Diethard Mattanovich, Brigitte Gasser, Alexandra B Graf

Research output: Contribution to journal, Article

Abstract

Because manually curated, non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analyses of sequence alignments and protein domain predictions were carried out to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs and 4916 hypothetical UTRs, and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of an upstream ATG or to erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions, thereby increasing the percentage of ORFs with functional annotation from 48% to 73%. Furthermore, we defined functional groups covering 25 biological cellular processes of interest by grouping all genes that are part of each defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org. In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.

Language: English
Article number: fow051
Journal: FEMS yeast research
Volume: 16
Issue number: 6
DOI: 10.1093/femsyr/fow051
Status: Published - Sep 2016

Fingerprint

Pichia
Open Reading Frames
Genome
Molecular Sequence Annotation
Genes
Proteins
Untranslated Regions
Biological Phenomena
Sequence Alignment
Proteomics
Introns
Databases
RNA

Keywords

  • Journal Article

Cite this

Valli, M., Tatto, N. E., Peymann, A., Gruber, C., Landes, N., Ekker, H., ... Graf, A. B. (2016). Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function. FEMS yeast research, 16(6), [fow051]. DOI: 10.1093/femsyr/fow051

Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function. / Valli, Minoska; Tatto, Nadine E; Peymann, Armin; Gruber, Clemens; Landes, Nils; Ekker, Heinz; Thallinger, Gerhard G; Mattanovich, Diethard; Gasser, Brigitte; Graf, Alexandra B.

In: FEMS yeast research, Vol. 16, No. 6, fow051, 09.2016.

Valli, M, Tatto, NE, Peymann, A, Gruber, C, Landes, N, Ekker, H, Thallinger, GG, Mattanovich, D, Gasser, B & Graf, AB 2016, 'Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function', FEMS yeast research, vol. 16, no. 6, fow051. DOI: 10.1093/femsyr/fow051
Valli M, Tatto NE, Peymann A, Gruber C, Landes N, Ekker H et al. Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function. FEMS yeast research. 2016 Sep;16(6):fow051. DOI: 10.1093/femsyr/fow051
Valli, Minoska ; Tatto, Nadine E ; Peymann, Armin ; Gruber, Clemens ; Landes, Nils ; Ekker, Heinz ; Thallinger, Gerhard G ; Mattanovich, Diethard ; Gasser, Brigitte ; Graf, Alexandra B. / Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function. In: FEMS yeast research. 2016 ; Vol. 16, No. 6.
@article{ad2d2b4f97fd4457a52262e86ce1d87c,
title = "Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function",
abstract = "Because manually curated, non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analyses of sequence alignments and protein domain predictions were carried out to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs and 4916 hypothetical UTRs, and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of an upstream ATG or to erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions, thereby increasing the percentage of ORFs with functional annotation from 48{\%} to 73{\%}. Furthermore, we defined functional groups covering 25 biological cellular processes of interest by grouping all genes that are part of each defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org. In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.",
keywords = "Journal Article",
author = "Minoska Valli and Tatto, {Nadine E} and Armin Peymann and Clemens Gruber and Nils Landes and Heinz Ekker and Thallinger, {Gerhard G} and Diethard Mattanovich and Brigitte Gasser and Graf, {Alexandra B}",
note = "{\circledC} FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.",
year = "2016",
month = "9",
doi = "10.1093/femsyr/fow051",
language = "English",
volume = "16",
journal = "FEMS yeast research",
issn = "1567-1356",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR

T1 - Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function

AU - Valli,Minoska

AU - Tatto,Nadine E

AU - Peymann,Armin

AU - Gruber,Clemens

AU - Landes,Nils

AU - Ekker,Heinz

AU - Thallinger,Gerhard G

AU - Mattanovich,Diethard

AU - Gasser,Brigitte

AU - Graf,Alexandra B

N1 - © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

PY - 2016/9

Y1 - 2016/9

AB - Because manually curated, non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analyses of sequence alignments and protein domain predictions were carried out to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs and 4916 hypothetical UTRs, and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of an upstream ATG or to erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions, thereby increasing the percentage of ORFs with functional annotation from 48% to 73%. Furthermore, we defined functional groups covering 25 biological cellular processes of interest by grouping all genes that are part of each defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org. In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.

KW - Journal Article

U2 - 10.1093/femsyr/fow051

DO - 10.1093/femsyr/fow051

M3 - Article

VL - 16

JO - FEMS yeast research

T2 - FEMS yeast research

JF - FEMS yeast research

SN - 1567-1356

IS - 6

M1 - fow051

ER -