Genome-wide CTCF distribution in vertebrates defines equivalent sites that aid the identification of disease-associated genes

David Martin, Cristina Pantoja, Ana Fernández Miñán, Christian Valdes-Quezada, Eduardo Moltó, Fuencisla Matesanz, Ozren Bogdanović, Elisa de la Calle-Mustienes, Orlando Domínguez, Leila Taher, Mayra Furlan-Magaril, Antonio Alcina, Susana Cañón, María Fedetz, María A Blasco, Paulo S Pereira, Ivan Ovcharenko, Félix Recillas-Targa, Lluís Montoliu, Miguel Manzanares, Roderic Guigó, Manuel Serrano, Fernando Casares, José Luis Gómez-Skarmeta

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Many genomic alterations associated with human diseases localize in noncoding regulatory elements located far from the promoters they regulate, making it challenging to link noncoding mutations or risk-associated variants with target genes. The range of action of a given set of enhancers is thought to be defined by insulator elements bound by the 11 zinc-finger nuclear factor CCCTC-binding protein (CTCF). Here we analyzed the genomic distribution of CTCF in various human, mouse and chicken cell types, demonstrating the existence of evolutionarily conserved CTCF-bound sites beyond mammals. These sites preferentially flank transcription factor-encoding genes, often associated with human diseases, and function as enhancer blockers in vivo, suggesting that they act as evolutionarily invariant gene boundaries. We then applied this concept to predict and functionally demonstrate that the polymorphic variants associated with multiple sclerosis located within the EVI5 gene impinge on the adjacent gene GFI1.

Original language: English
Pages (from-to): 708-714
Number of pages: 7
Journal: Nature structural & molecular biology
Volume: 18
Issue number: 6
DOIs: https://doi.org/10.1038/nsmb.2059
Publication status: Published - Jun 2011

Fingerprint

Vertebrates
Genome
Genes
Insulator Elements
Zinc Fingers
Multiple Sclerosis
Mammals
Chickens
Carrier Proteins
Transcription Factors
Mutation

Keywords

  • Animals
  • CCCTC-Binding Factor
  • Cell Line
  • Chickens
  • Conserved Sequence
  • DNA/metabolism
  • DNA-Binding Proteins/genetics
  • Genome
  • Humans
  • Mice
  • Multiple Sclerosis/pathology
  • Nuclear Proteins/genetics
  • Polymorphism, Genetic
  • Protein Binding
  • Repressor Proteins/metabolism
  • Transcription Factors/genetics

Cite this

Martin, D., Pantoja, C., Fernández Miñán, A., Valdes-Quezada, C., Moltó, E., Matesanz, F., ... Gómez-Skarmeta, J. L. (2011). Genome-wide CTCF distribution in vertebrates defines equivalent sites that aid the identification of disease-associated genes. Nature structural & molecular biology, 18(6), 708-714. https://doi.org/10.1038/nsmb.2059

@article{8d3d77d9504942919081fedb4b6defa9,
title = "Genome-wide CTCF distribution in vertebrates defines equivalent sites that aid the identification of disease-associated genes",
abstract = "Many genomic alterations associated with human diseases localize in noncoding regulatory elements located far from the promoters they regulate, making it challenging to link noncoding mutations or risk-associated variants with target genes. The range of action of a given set of enhancers is thought to be defined by insulator elements bound by the 11 zinc-finger nuclear factor CCCTC-binding protein (CTCF). Here we analyzed the genomic distribution of CTCF in various human, mouse and chicken cell types, demonstrating the existence of evolutionarily conserved CTCF-bound sites beyond mammals. These sites preferentially flank transcription factor-encoding genes, often associated with human diseases, and function as enhancer blockers in vivo, suggesting that they act as evolutionarily invariant gene boundaries. We then applied this concept to predict and functionally demonstrate that the polymorphic variants associated with multiple sclerosis located within the EVI5 gene impinge on the adjacent gene GFI1.",
keywords = "Animals, CCCTC-Binding Factor, Cell Line, Chickens, Conserved Sequence, DNA/metabolism, DNA-Binding Proteins/genetics, Genome, Humans, Mice, Multiple Sclerosis/pathology, Nuclear Proteins/genetics, Polymorphism, Genetic, Protein Binding, Repressor Proteins/metabolism, Transcription Factors/genetics",
author = "David Martin and Cristina Pantoja and {Fern{\'a}ndez Mi{\~n}{\'a}n}, Ana and Christian Valdes-Quezada and Eduardo Molt{\'o} and Fuencisla Matesanz and Ozren Bogdanović and {de la Calle-Mustienes}, Elisa and Orlando Dom{\'i}nguez and Leila Taher and Mayra Furlan-Magaril and Antonio Alcina and Susana Ca{\~n}{\'o}n and Mar{\'i}a Fedetz and Blasco, {Mar{\'i}a A} and Pereira, {Paulo S} and Ivan Ovcharenko and F{\'e}lix Recillas-Targa and Llu{\'i}s Montoliu and Miguel Manzanares and Roderic Guig{\'o} and Manuel Serrano and Fernando Casares and G{\'o}mez-Skarmeta, {Jos{\'e} Luis}",
year = "2011",
month = "6",
doi = "10.1038/nsmb.2059",
language = "English",
volume = "18",
pages = "708--714",
journal = "Nature structural \& molecular biology",
issn = "1545-9993",
publisher = "Nature Publishing Group",
number = "6",

}

TY - JOUR
T1 - Genome-wide CTCF distribution in vertebrates defines equivalent sites that aid the identification of disease-associated genes
AU - Martin, David
AU - Pantoja, Cristina
AU - Fernández Miñán, Ana
AU - Valdes-Quezada, Christian
AU - Moltó, Eduardo
AU - Matesanz, Fuencisla
AU - Bogdanović, Ozren
AU - de la Calle-Mustienes, Elisa
AU - Domínguez, Orlando
AU - Taher, Leila
AU - Furlan-Magaril, Mayra
AU - Alcina, Antonio
AU - Cañón, Susana
AU - Fedetz, María
AU - Blasco, María A
AU - Pereira, Paulo S
AU - Ovcharenko, Ivan
AU - Recillas-Targa, Félix
AU - Montoliu, Lluís
AU - Manzanares, Miguel
AU - Guigó, Roderic
AU - Serrano, Manuel
AU - Casares, Fernando
AU - Gómez-Skarmeta, José Luis
PY - 2011/6
Y1 - 2011/6
N2 - Many genomic alterations associated with human diseases localize in noncoding regulatory elements located far from the promoters they regulate, making it challenging to link noncoding mutations or risk-associated variants with target genes. The range of action of a given set of enhancers is thought to be defined by insulator elements bound by the 11 zinc-finger nuclear factor CCCTC-binding protein (CTCF). Here we analyzed the genomic distribution of CTCF in various human, mouse and chicken cell types, demonstrating the existence of evolutionarily conserved CTCF-bound sites beyond mammals. These sites preferentially flank transcription factor-encoding genes, often associated with human diseases, and function as enhancer blockers in vivo, suggesting that they act as evolutionarily invariant gene boundaries. We then applied this concept to predict and functionally demonstrate that the polymorphic variants associated with multiple sclerosis located within the EVI5 gene impinge on the adjacent gene GFI1.
KW - Animals
KW - CCCTC-Binding Factor
KW - Cell Line
KW - Chickens
KW - Conserved Sequence
KW - DNA/metabolism
KW - DNA-Binding Proteins/genetics
KW - Genome
KW - Humans
KW - Mice
KW - Multiple Sclerosis/pathology
KW - Nuclear Proteins/genetics
KW - Polymorphism, Genetic
KW - Protein Binding
KW - Repressor Proteins/metabolism
KW - Transcription Factors/genetics
U2 - 10.1038/nsmb.2059
DO - 10.1038/nsmb.2059
M3 - Article
VL - 18
SP - 708
EP - 714
JO - Nature structural & molecular biology
JF - Nature structural & molecular biology
SN - 1545-9993
IS - 6
ER -