Graphs in phylogenetic comparative analysis: Anscombe's quartet revisited

Liam J. Revell; Klaus Schliep; Eugenio Valderrama; James E. Richardson

doi:10.1111/2041-210X.13067

Liam J. Revell*, Klaus Schliep, Eugenio Valderrama, James E. Richardson

*Corresponding author for this work

Publication: Contribution to journal › Article › Peer-reviewed

Abstract

In 1973, the statistician Francis Anscombe used a clever set of bivariate datasets (now known as Anscombe's quartet) to illustrate the importance of graphing data as a component of statistical analyses. In his example, each of the four datasets yielded identical regression coefficients and model fits, and yet when visualized revealed strikingly different patterns of covariation between x and y. Phylogenetic comparative methods (the set of methodologies that use phylogenies, often combined with phenotypic trait data, to make inferences about evolution) are statistical methods too; yet visualizing the data and phylogeny in a sensible way that would permit us to detect unexpected patterns or unanticipated deviations from model assumptions is not a routine component of phylogenetic comparative analyses. Here, we use a quartet of phylogenetic datasets to illustrate that the same estimated parameters and model fits can be obtained from data that were generated using markedly different procedures—including pure Brownian motion evolution and randomly selected data uncorrelated with the tree. Just as in the case of Anscombe's quartet, when graphed the differences between the four datasets are quickly revealed. The intent of this article is to help build the general case that phylogenetic comparative methods are statistical methods and consequently that graphing or visualization should invariably be included as an essential step in our standard data analytical pipelines. Phylogenies are complex data structures and thus visualizing data on trees in a meaningful and useful way is a challenging endeavour. We recommend that the development of graphical methods for simultaneously visualizing data and tree should continue to be an important goal in phylogenetic comparative biology.
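As a minimal illustration of the abstract's opening point, the sketch below fits an ordinary least-squares line to each of Anscombe's four classic datasets and prints the slope, intercept and R² for comparison. It covers only the original 1973 quartet; the phylogenetic quartet described in the article is not reproduced here. The helper name `ols_fit` and the `QUARTET` and `FITS` containers are assumptions made for this example and do not come from the article.

```python
# Anscombe's (1973) quartet; datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def ols_fit(x, y):
    """Return (slope, intercept, r_squared) of a simple least-squares line.

    Hypothetical helper written for this illustration only.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Fit every dataset and print the estimates side by side.
FITS = {name: ols_fit(x, y) for name, (x, y) in QUARTET.items()}

for name, (slope, intercept, r_squared) in FITS.items():
    print(f"{name:>3}: slope={slope:.3f} "
          f"intercept={intercept:.3f} R^2={r_squared:.3f}")
```

The printed estimates agree closely across the four datasets, which is why a scatterplot of each (x, y) pair, rather than the fitted coefficients, is needed to tell them apart.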

Original language	English
Pages (from-to)	2145-2154
Number of pages	10
Journal	Methods in Ecology and Evolution
Volume	9
Issue number	10
DOIs	https://doi.org/10.1111/2041-210X.13067
Publication status	Published - 1 Oct 2018
Externally published	Yes

ASJC Scopus subject areas

Ecology, Evolution, Behavior and Systematics
Ecological Modeling

Access to document

10.1111/2041-210X.13067

Other files and links

http://www.scopus.com/inward/record.url?scp=85052841574&partnerID=8YFLogxK

Cite this

@article{c739eac3e9df48fe8871881c713bff4e,

title = "Graphs in phylogenetic comparative analysis: Anscombe's quartet revisited",

abstract = "In 1973, the statistician Francis Anscombe used a clever set of bivariate datasets (now known as Anscombe's quartet) to illustrate the importance of graphing data as a component of statistical analyses. In his example, each of the four datasets yielded identical regression coefficients and model fits, and yet when visualized revealed strikingly different patterns of covariation between x and y. Phylogenetic comparative methods (the set of methodologies that use phylogenies, often combined with phenotypic trait data, to make inferences about evolution) are statistical methods too; yet visualizing the data and phylogeny in a sensible way that would permit us to detect unexpected patterns or unanticipated deviations from model assumptions is not a routine component of phylogenetic comparative analyses. Here, we use a quartet of phylogenetic datasets to illustrate that the same estimated parameters and model fits can be obtained from data that were generated using markedly different procedures—including pure Brownian motion evolution and randomly selected data uncorrelated with the tree. Just as in the case of Anscombe's quartet, when graphed the differences between the four datasets are quickly revealed. The intent of this article is to help build the general case that phylogenetic comparative methods are statistical methods and consequently that graphing or visualization should invariably be included as an essential step in our standard data analytical pipelines. Phylogenies are complex data structures and thus visualizing data on trees in a meaningful and useful way is a challenging endeavour. We recommend that the development of graphical methods for simultaneously visualizing data and tree should continue to be an important goal in phylogenetic comparative biology.",

keywords = "comparative methods, macroevolution, phylogeny, plotting, visualization",

author = "Revell, {Liam J.} and Schliep, Klaus and Valderrama, Eugenio and Richardson, {James E.}",

year = "2018",

month = oct,

day = "1",

doi = "10.1111/2041-210X.13067",

language = "English",

volume = "9",

pages = "2145--2154",

journal = "Methods in Ecology and Evolution",

issn = "2041-210X",

publisher = "British Ecological Society",

number = "10",

}

TY - JOUR

T1 - Graphs in phylogenetic comparative analysis

T2 - Anscombe's quartet revisited

AU - Revell, Liam J.

AU - Schliep, Klaus

AU - Valderrama, Eugenio

AU - Richardson, James E.

PY - 2018/10/1

Y1 - 2018/10/1

N2 - In 1973, the statistician Francis Anscombe used a clever set of bivariate datasets (now known as Anscombe's quartet) to illustrate the importance of graphing data as a component of statistical analyses. In his example, each of the four datasets yielded identical regression coefficients and model fits, and yet when visualized revealed strikingly different patterns of covariation between x and y. Phylogenetic comparative methods (the set of methodologies that use phylogenies, often combined with phenotypic trait data, to make inferences about evolution) are statistical methods too; yet visualizing the data and phylogeny in a sensible way that would permit us to detect unexpected patterns or unanticipated deviations from model assumptions is not a routine component of phylogenetic comparative analyses. Here, we use a quartet of phylogenetic datasets to illustrate that the same estimated parameters and model fits can be obtained from data that were generated using markedly different procedures—including pure Brownian motion evolution and randomly selected data uncorrelated with the tree. Just as in the case of Anscombe's quartet, when graphed the differences between the four datasets are quickly revealed. The intent of this article is to help build the general case that phylogenetic comparative methods are statistical methods and consequently that graphing or visualization should invariably be included as an essential step in our standard data analytical pipelines. Phylogenies are complex data structures and thus visualizing data on trees in a meaningful and useful way is a challenging endeavour. We recommend that the development of graphical methods for simultaneously visualizing data and tree should continue to be an important goal in phylogenetic comparative biology.

KW - comparative methods

KW - macroevolution

KW - phylogeny

KW - plotting

KW - visualization

UR - http://www.scopus.com/inward/record.url?scp=85052841574&partnerID=8YFLogxK

U2 - 10.1111/2041-210X.13067

DO - 10.1111/2041-210X.13067

M3 - Article

AN - SCOPUS:85052841574

SN - 2041-210X

VL - 9

SP - 2145

EP - 2154

JO - Methods in Ecology and Evolution

JF - Methods in Ecology and Evolution

IS - 10

ER -
