Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function

Minoska Valli, Nadine E Tatto, Armin Peymann, Clemens Gruber, Nils Landes, Heinz Ekker, Gerhard G Thallinger, Diethard Mattanovich, Brigitte Gasser, Alexandra B Graf

Publication: Contribution to journal, Article, Research, Peer-reviewed

Abstract

Because a manually curated, non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analyses of sequence alignments and protein domain predictions were carried out to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs and 4916 hypothetical UTRs, and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of an upstream ATG or erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions. Thereby, the percentage of ORFs with functional annotation was increased from 48% to 73%. Furthermore, we defined functional groups, covering 25 biological cellular processes of interest, by grouping all genes that are part of the defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org. In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.
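To put the reported counts in proportion, the short sketch below relates them to the 5325 ORFs of the curated annotation. It is only an illustration: the variable and function names are hypothetical, and it assumes that the 73% figure refers to the 5325 curated ORFs, which the abstract does not state explicitly. The 48% figure is left unconverted because the size of the earlier annotation is not given.

```python
# Figures quoted in the abstract.
TOTAL_ORFS = 5325        # ORFs in the curated annotation
NEW_ORFS = 492           # newly identified ORFs
CORRECTED_ORFS = 341     # corrected ORF predictions
ANNOTATED_AFTER = 0.73   # share of ORFs with functional annotation after curation


def share(count, total=TOTAL_ORFS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


def approx_count(fraction, total=TOTAL_ORFS):
    """Return the approximate number of ORFs that a fraction of total represents."""
    return round(fraction * total)


# Assumption: all shares are taken relative to the 5325 curated ORFs.
summary = {
    "new_orfs_pct": share(NEW_ORFS),
    "corrected_orfs_pct": share(CORRECTED_ORFS),
    "functionally_annotated_after": approx_count(ANNOTATED_AFTER),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Under that assumption, roughly one in eleven ORFs in the curated set is new, and close to 3900 ORFs carry a functional annotation.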

Original language: English
Article number: fow051
Journal: FEMS yeast research
Volume: 16
Issue number: 6
DOI: https://doi.org/10.1093/femsyr/fow051
Publication status: Published, Sep 2016

Fingerprint

Pichia
Open Reading Frames
Genome
Molecular Sequence Annotation
Genes
Proteins
Untranslated Regions
Biological Phenomena
Sequence Alignment
Proteomics
Introns
Databases
RNA


    Cite this

    Valli, M., Tatto, N. E., Peymann, A., Gruber, C., Landes, N., Ekker, H., ... Graf, A. B. (2016). Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function. FEMS yeast research, 16(6), [fow051]. https://doi.org/10.1093/femsyr/fow051

    @article{ad2d2b4f97fd4457a52262e86ce1d87c,
    title = "Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function",
    abstract = "As manually curated and non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analysis of sequence alignment and protein domain predictions were made to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs, 4916 hypothetical UTRs and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of upstream ATG or erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions. Thereby, the percentage of ORFs with functional annotation was increased from 48{\%} to 73{\%}. Furthermore, we defined functional groups, covering 25 biological cellular processes of interest, by grouping all genes that are part of the defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.",
    keywords = "Journal Article",
    author = "Minoska Valli and Tatto, {Nadine E} and Armin Peymann and Clemens Gruber and Nils Landes and Heinz Ekker and Thallinger, {Gerhard G} and Diethard Mattanovich and Brigitte Gasser and Graf, {Alexandra B}",
    note = "{\textcopyright} FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.",
    year = "2016",
    month = "9",
    doi = "10.1093/femsyr/fow051",
    language = "English",
    volume = "16",
    journal = "FEMS yeast research",
    issn = "1567-1356",
    publisher = "Wiley-Blackwell",
    number = "6",

    }

    TY - JOUR

    T1 - Curation of the genome annotation of Pichia pastoris (Komagataella phaffii) CBS7435 from gene level to protein function

    AU - Valli, Minoska

    AU - Tatto, Nadine E

    AU - Peymann, Armin

    AU - Gruber, Clemens

    AU - Landes, Nils

    AU - Ekker, Heinz

    AU - Thallinger, Gerhard G

    AU - Mattanovich, Diethard

    AU - Gasser, Brigitte

    AU - Graf, Alexandra B

    N1 - © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

    PY - 2016/9

    Y1 - 2016/9

    AB - As manually curated and non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analysis of sequence alignment and protein domain predictions were made to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs, 4916 hypothetical UTRs and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of upstream ATG or erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions. Thereby, the percentage of ORFs with functional annotation was increased from 48% to 73%. Furthermore, we defined functional groups, covering 25 biological cellular processes of interest, by grouping all genes that are part of the defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.

    KW - Journal Article

    U2 - 10.1093/femsyr/fow051

    DO - 10.1093/femsyr/fow051

    M3 - Article

    VL - 16

    JO - FEMS yeast research

    JF - FEMS yeast research

    SN - 1567-1356

    IS - 6

    M1 - fow051

    ER -