plotnine.data.anscombe_quartet

anscombe_quartet = pd.read_csv(DATA_DIR / "anscombe-quartet.csv")

Anscombe’s Quartet

Description

A dataset by the statistician Francis Anscombe that challenged the commonly held belief that “numerical calculations are exact, but graphs are rough” (Anscombe, 1973).

It comprises four (the quartet!) small sub-datasets of 11 points each. The sub-datasets have different distributions but nearly identical descriptive statistics. It is perhaps the best argument for visualising data.

Format

A dataframe with 44 rows and 3 variables

Column   Description
dataset  Label of the sub-dataset the point belongs to
x        x value of the point
y        y value of the point

References

Anscombe, F. J. (1973). “Graphs in Statistical Analysis”. The American Statistician. 27 (1): 17–21.