paulnorthrop / anscombiser

Create datasets with identical summary statistics

Home Page: https://paulnorthrop.github.io/anscombiser/

anscombiser

What does anscombiser do?

Anscombe’s quartet is a set of four two-variable datasets that share several summary statistics (essentially the means, variances and correlation) but have very different joint distributions. The differences become apparent only when the data are plotted, which illustrates the importance of using graphical displays in Statistics. The anscombiser package provides a quick and easy way to create several datasets that have common values for Anscombe’s summary statistics but display very different behaviour when plotted. It does this by transforming (shifting, scaling and rotating) an input dataset so that it achieves the target summary statistics.
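The shift, scale and rotate idea can be sketched in base R. This is an illustration under stated assumptions, not the anscombiser implementation: match_stats() is a hypothetical helper, the Cholesky factorisation is just one convenient way to scale and rotate, and the built-in cars and anscombe datasets stand in for the input and target data.

```r
# Hypothetical helper (not part of anscombiser): transform x so that it has
# the given column means and covariance matrix.
match_stats <- function(x, target_mean, target_cov) {
  x <- as.matrix(x)
  # Shift: centre each column at zero
  centred <- sweep(x, 2, colMeans(x))
  # chol(S) returns an upper-triangular R with t(R) %*% R equal to S
  r_in <- chol(stats::var(x))
  r_target <- chol(target_cov)
  # Scale and rotate: remove the input covariance, then impose the target one
  transformed <- centred %*% solve(r_in) %*% r_target
  # Shift: move the columns to the target means
  sweep(transformed, 2, target_mean, "+")
}

# Give the cars data the summary statistics of Anscombe's first dataset
target <- anscombe[, c("x1", "y1")]
new_cars <- match_stats(cars, colMeans(target), stats::var(target))

colMeans(new_cars)    # compare with colMeans(target)
stats::var(new_cars)  # compare with stats::var(target)
```

Matching the means and the covariance matrix also matches the variances and the correlation, which are the statistics that Anscombe’s quartet has in common.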

An example

The mimic() function transforms an input dataset (dino below left) so that it has the same values of Anscombe’s summary statistics as another dataset (trump below right).

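library(anscombiser)
library(datasauRus)
# Extract the dino dataset (two columns) from datasauRus
dino <- datasaurus_dozen_wide[, c("dino_x", "dino_y")]
# Transform dino so that it shares Anscombe's summary statistics with trump
# (trump is not defined here; it is assumed to be a dataset supplied by anscombiser)
new_dino <- mimic(dino, trump)
# Produce the two plots, with the legend placed differently in each
plot(new_dino, legend_args = list(x = "topright"))
plot(new_dino, input = TRUE, legend_args = list(x = "bottomright"), pch = 20)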

In this example the two datasets had similar summary statistics from the outset, so the appearance of the dino dataset has changed little. When the summary statistics differ more, the input dataset will be deformed, but its general shape will still be recognisable.

The rotation applied to the input dataset is not unique. The function mimic() (and the function anscombise(), which is specific to Anscombe’s quartet) has an argument idempotent that controls how the rotation is performed. In the special case where the input dataset already has the desired summary statistics, using idempotent = TRUE ensures that the output dataset is the same as the input dataset.
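Why the rotation is not unique can be seen in a small base R sketch. It is an assumption-laden illustration rather than a description of how mimic() or its idempotent argument work internally: match_stats_rotated() and its angle argument are hypothetical. An extra rotation is applied while the data have an identity covariance matrix, so any angle gives the same final summary statistics.

```r
# Hypothetical helper (not part of anscombiser): as a shift/scale/rotate
# transformation, but with an extra rotation by `angle` in the middle.
match_stats_rotated <- function(x, target_mean, target_cov, angle = 0) {
  x <- as.matrix(x)
  centred <- sweep(x, 2, colMeans(x))
  # 2 x 2 rotation matrix: orthogonal, so it leaves an identity covariance unchanged
  q <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), 2, 2)
  out <- centred %*% solve(chol(stats::var(x))) %*% q %*% chol(target_cov)
  sweep(out, 2, target_mean, "+")
}

# The input already has the target statistics, because the targets are its own
x <- as.matrix(anscombe[, c("x1", "y1")])
same <- match_stats_rotated(x, colMeans(x), stats::var(x), angle = 0)
turned <- match_stats_rotated(x, colMeans(x), stats::var(x), angle = pi / 3)

isTRUE(all.equal(unname(same), unname(x)))    # no extra rotation: input returned unchanged
isTRUE(all.equal(unname(turned), unname(x)))  # extra rotation: a different dataset
stats::var(turned)                            # compare with stats::var(x)
```

With angle = 0 the sketch behaves idempotently, in the sense described above. A non-zero angle produces a different dataset that still has the same means and covariance matrix.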

Installation

To get the current released version from CRAN:

install.packages("anscombiser")

Vignette

See vignette("intro-to-anscombiser", package = "anscombiser") for an overview of the package.
