The Datasaurus Dozen

Description

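A collection of 13 data sets, each containing different values of the variables x and y. The sets look very different when plotted but share nearly identical summary statistics (means, standard deviations, and the correlation between x and y).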

Usage

datasaurus

Format

A data frame with 1846 observations (142 per data set) of 3 variables.

dataset

Name of the data set each observation belongs to (13 distinct values). The original datasaurus set is named "dino".

x

Values for x.

y

Values for y.

Source

The data sets were created by Justin Matejka and George Fitzmaurice (see https://www.autodesk.com/research/publications/same-stats-different-graphs), inspired by the datasaurus set from Alberto Cairo (see http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html). The data can also be found in the datasauRus package (License, copyright).

References

Justin Matejka and George Fitzmaurice. 2017. Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, 1290-1294. DOI: https://doi.org/10.1145/3025453.3025912.
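
Examples

The data frame described under Format can be summarised per set to check how closely the summary statistics agree. The following is a minimal sketch in base R, not part of the package: `summarise_sets` is a hypothetical helper name, and it assumes only that its input has the three columns dataset, x and y.

```r
# Hypothetical helper (not provided by any package): per-set summary
# statistics for a data frame with columns `dataset`, `x` and `y`.
summarise_sets <- function(df) {
  # One data frame per set; drop = TRUE skips unused factor levels.
  parts <- split(df, df$dataset, drop = TRUE)

  stats <- lapply(parts, function(d) {
    data.frame(
      n      = nrow(d),
      mean_x = mean(d$x),
      mean_y = mean(d$y),
      sd_x   = sd(d$x),
      sd_y   = sd(d$y),
      cor_xy = cor(d$x, d$y)
    )
  })

  # Stack the one-row results and add the set name as the first column.
  out <- do.call(rbind, stats)
  data.frame(dataset = names(parts), out, row.names = NULL)
}

# Usage, assuming the package that provides `datasaurus` is attached:
# summarise_sets(datasaurus)
#
# Plotting a single set, here the original "dino":
# with(subset(datasaurus, dataset == "dino"), plot(x, y))
```

Looking at the table of statistics next to a plot of each set shows why summary statistics alone should not be relied on to describe a data set.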
