lindeloev / tests-as-linear

Common statistical tests are linear models (or: how to teach stats)

Home Page: https://lindeloev.github.io/tests-as-linear/

CSS 2.53% JavaScript 65.07% HTML 32.40%

tests-as-linear's People

Contributors

havanagrawal, hcientist, lindeloev, thets, thorbjornwolf, wildoane

tests-as-linear's Issues

naming of "Exact?" column

I find the "Exact?" column of the "Common statistical tests are linear models" pdf to be somewhat misleading, since the "Exact?" column links to simulations that show correspondence for sufficiently large n. My concerns would be alleviated if the column were renamed "Correspondence" or "Equivalence".

I really appreciate this project: nice work!

correlation vs linear model

First, I just want to say that I love this! Thanks for the work.

Maybe you can add a little note to the section on correlations:
There is a difference between corr(x, y) and lm(y ~ 1 + x). Correlation is commutative, corr(x, y) = corr(y, x), but lm(y ~ 1 + x) ≠ lm(x ~ 1 + y). This is especially relevant when both x and y contain measurement error.

A good reference for this is here:
https://elifesciences.org/articles/00638
The tls package in R provides one option for computing this.
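
A minimal R sketch of the asymmetry described above, using simulated data (variable names and numbers are illustrative, not from the book):

    set.seed(1)
    x = rnorm(100)
    y = 0.5 * x + rnorm(100)

    cor(x, y) == cor(y, x)     # TRUE: correlation is symmetric in its arguments
    coef(lm(y ~ 1 + x))["x"]   # slope of y regressed on x
    coef(lm(x ~ 1 + y))["y"]   # slope of x regressed on y; generally a different value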

Add links to external material

Following the tweet, I have been made aware of many excellent resources. This issue just serves to collect them before I add them somewhere.

https://www.middleprofessor.com/files/applied-biostatistics_bookdown/_book/ looks like a solid intro to linear modeling equivalent to the stats 101 models. Downsides: there is little visualization, no mention of non-parametric tests (I think?), and a lot more sampling theory. Check if there are worked examples.

https://siminab.github.io/2018/01/10/everything-in-statistical-modeling-can-be-seen-as-a-regression/ covers the basics, but is likely too superficial.

https://www.ncbi.nlm.nih.gov/pubmed/20063905 looks like an excellent academic discussion of rote learning vs. modeling.

Is there any Python code add-on?

Thanks very much for the R code and the explanation of the GLM!
I think it is pretty cool to let people understand all of these statistics in a GLM way.
Though the R code is easy and clear, is it possible to add Python code for reference?

I list below the models I could find (mostly scipy), but the syntax is not as beautiful as R's... (an R sketch of these models follows the list)

Y ~ continuous x

Y ~ discrete x

Multiple regression : lm(y ~ 1 + x1 + x2 + ...)
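
For reference, a small R sketch of the three models listed above, on toy data (variable names are illustrative). In Python, the statsmodels formula interface (statsmodels.formula.api.ols) accepts essentially the same formula strings and may be the closest counterpart, but that is only a suggestion, not something provided by this project:

    N  = 100
    y  = rnorm(N)
    x1 = rnorm(N)                          # continuous predictor
    g  = factor(rep(c("a", "b"), N / 2))   # discrete (categorical) predictor
    x2 = rnorm(N)

    lm(y ~ 1 + x1)        # y ~ continuous x
    lm(y ~ 1 + g)         # y ~ discrete x (dummy-coded group)
    lm(y ~ 1 + x1 + x2)   # multiple regression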

rnorm_fixed can be simpler

As seen in the opening source code block of section 2 (Settings and toy data), rnorm_fixed is a function defined as
rnorm_fixed = function(N, mu = 0, sd = 1) scale(rnorm(N)) * sd + mu. Scaling something only to unscale it right after is confusing; rnorm(N, mean = mu, sd = sd) should do just fine.
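
For context, a quick comparison of the two versions; the scale() construction may be intentional, since it fixes the sample mean and sd exactly, whereas plain rnorm() only matches them in expectation:

    rnorm_fixed = function(N, mu = 0, sd = 1) scale(rnorm(N)) * sd + mu

    x_fixed = rnorm_fixed(100, mu = 5, sd = 2)
    x_plain = rnorm(100, mean = 5, sd = 2)

    c(mean(x_fixed), sd(x_fixed))   # exactly 5 and 2
    c(mean(x_plain), sd(x_plain))   # approximately 5 and 2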

One-way ANOVA null hypothesis: B1 or B0?

Maybe I'm misunderstanding it, but shouldn't the one-way ANOVA null hypothesis be $y = \beta_0$, not $y = \beta_1$?

$\beta_0$ is the mean of the first group, no? Is this a case of 0-indexing vs 1-indexing? 😄
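
For reference, a generic sketch of the usual dummy coding behind the one-way ANOVA section (two dummy variables shown; this is not quoted from the book): the full model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, and the omnibus null hypothesis is $H_0: \beta_1 = \beta_2 = 0$, under which the model reduces to $y = \beta_0$, the mean of the first (reference) group.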

Broken link in Section 7.

Hi!

In the first paragraph of Section 7, there is a statement:

See this nice introduction to Chi-Square tests as linear models.

for which the link is broken.

I have not been able to find the document elsewhere.

Thanks for this wonderful resource.

Break more assumptions

The simulated data is currently balanced and normal, with approximately equal variances and no correlation. The results should generalize under deviations from these assumptions. If this can be implemented in a way that does not obfuscate the real message/argument, it would be an improvement.
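
One possible way to generate assumption-breaking toy data (a hedged sketch; group sizes, distributions, and names are made up for illustration):

    set.seed(40)
    n1 = 30; n2 = 70                 # unbalanced groups
    y1 = rexp(n1, rate = 1)          # skewed, non-normal
    y2 = rexp(n2, rate = 0.3)        # also a different variance
    d  = data.frame(
      value = c(y1, y2),
      group = factor(rep(c("a", "b"), c(n1, n2)))
    )
    # the book's comparisons could then be re-run on d, e.g. t.test(value ~ group, data = d)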

Output for Wilcoxon signed-rank test correct?

Hi, thanks for this great resource! I'm working through the book now.

Can I confirm that the p-values published in the table of section '4.1.3 R code: Wilcoxon signed-rank test' are correct? I get different p-values for both the Wilcoxon test and the linear model using signed ranks (0.2628 and 0.2650, respectively). I have been able to replicate all other tests in the book so far using the toy data set. Thanks.
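
For anyone rerunning the check, a sketch of the comparison in question; the signed_rank helper follows the book's construction, but the data here is arbitrary, so the p-values will differ from the table:

    signed_rank = function(x) sign(x) * rank(abs(x))

    set.seed(40)
    y = rnorm(50, mean = 0.3)

    wilcox.test(y)                    # Wilcoxon signed-rank test
    summary(lm(signed_rank(y) ~ 1))   # linear model on the signed ranks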

aov is a wrapper for lm

This is a great cheat sheet and comparison of the methods that you've made. Thanks for taking the time to think about it and write it up!

One small comment....
I'm sure you're aware, but aov is just a wrapper for lm with some specific settings (e.g. Helmert contrasts) and print/summary methods that approximate a classical ANOVA table, so it would be difficult for the models to return something different... the way section 6.1.3 is written at the moment feels a bit like you're surprised that they yield the same thing.

Cheers!
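
A quick illustration of the point with a built-in dataset: the two calls produce the same sums of squares, F statistic, and p-value:

    fit_aov = aov(weight ~ group, data = PlantGrowth)
    fit_lm  = lm(weight ~ group, data = PlantGrowth)

    summary(fit_aov)   # classical ANOVA table
    anova(fit_lm)      # same table from the linear model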

Incorrect results in Table 5.1.4 R code: independent t-test

This site is fantastic - please can I share a couple of minor errors:

In the table, the degrees of freedom for the t-test are shown as 48 instead of 98.

The confidence intervals also don't match, but I can make them match exactly if the linear model's CIs on beta_1 are used (i.e. if the directionality is reversed on either the t-test or the lm).

Thank you!
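
A sketch of the comparison, using made-up data with N = 100 so the expected degrees of freedom are 98:

    set.seed(40)
    value = c(rnorm(50, mean = 0), rnorm(50, mean = 0.5))
    group = factor(rep(c("a", "b"), each = 50))

    t.test(value ~ group, var.equal = TRUE)   # reports df = 98
    summary(lm(value ~ 1 + group))            # slope tested on 98 residual df
    confint(lm(value ~ 1 + group))            # CI on the group difference (beta_1)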

Add explanation of rank() to text

Thanks for this great resource! In the section on Pearson Correlation, what is rank(x)? Did I miss it? If not, I suggest elaborating on this in the text, as I am probably not the only one with this question. Awesome work!
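
In case it helps, a small illustration of what rank() does and why it appears in the correlation section (toy numbers chosen for illustration):

    rank(c(3.2, 1.1, 5.5, 2.0))     # returns 3 1 4 2: each value's position in sorted order

    x = rnorm(30); y = rnorm(30)
    cor(x, y, method = "spearman")  # Spearman correlation
    cor(rank(x), rank(y))           # same value: Pearson correlation on the ranks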

Cheat sheet: Kruskal-Wallis changes

Common name should be Kruskal-Wallis (sheet has an extra L)
Linear Model in Words should read "Same, but it predicts the rank of y" (currently reads "signed rank")
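
The corrected row, written out as a sketch in code (toy data; the linear model is the book's close approximation, not an exact identity):

    set.seed(40)
    y = rnorm(90)
    group = factor(rep(c("a", "b", "c"), each = 30))

    kruskal.test(y ~ group)            # Kruskal-Wallis
    summary(lm(rank(y) ~ 1 + group))   # linear model predicting the rank of y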
