Linear Regression with errors in both X and Y, correlated or not, confidence intervals and plots.

License: MIT License

LinearFitXYerrors.jl

This small Julia package, based on York et al. (2004), performs 1D linear fitting of experimental data with uncertainties in both X and Y:

 Linear fit:             Y = a + b*X                             [1]
        
 Errors:                 X ± σX;  Y ± σY                         [2]

 Errors' correlation:    r = cov(σX, σY) / (σX * σY)             [3]

where:

  • X and Y are input data vectors with length ≥ 3
  • Optional standard deviation errors σX and σY are vectors or scalars
  • Optional r is the correlation between the σX and σY errors.
    r can be a vector or scalar

A bivariate Gaussian distribution is assumed for the σX and σY errors (error ellipses).
If no errors are provided, or if only σX or σY is provided, then equivalent regression results could be obtained with the LsqFit.jl package.
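To illustrate the error-correlation definition in [3], the snippet below builds hypothetical synthetic error samples (the names eX, eY and the mixing weights are invented for this sketch) and checks that Julia's `cor` matches cov(σX, σY) / (σX * σY):

```julia
using Statistics, Random

Random.seed!(1)
eX = randn(1_000)                      # hypothetical X-error samples
eY = 0.6 .* eX .+ 0.8 .* randn(1_000)  # Y errors partially correlated with eX

# r per equation [3]: covariance normalized by the two standard deviations
r = cor(eX, eY)
r ≈ cov(eX, eY) / (std(eX) * std(eY))  # true by definition
```

In the package, r may also be supplied per data point as a vector rather than a single scalar.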

The package computes:

  • The intercept a, the slope b and their uncertainties σa and σb
  • σa95 and σb95: 95%-confidence intervals using a two-tailed t-Student distribution, e.g.: b ± σb95 = b ± t(0.975,N-2)*σb
  • Goodness of fit S (reduced Χ² test): the standard error of estimate, following a Χ² distribution with N-2 degrees of freedom
    S ~ 1: fit consistent with errors, S > 1: poor fit, S >> 1: errors underestimated, S < 1: overfitting or errors overestimated
  • Pearson's correlation coefficient ρ that accounts for data errors
  • Optional display of results with error ellipses and confidence intervals (the latter for no errors case only)
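The 95%-confidence formula above can be checked directly with Distributions.jl (a dependency of the package). The sample size and slope uncertainty below are illustrative values, not package output:

```julia
using Distributions

# Two-tailed 95% CI half-width from a slope standard error:
#   σb95 = t(0.975, N-2) * σb
N  = 10        # hypothetical number of data points
σb = 0.0706    # hypothetical slope standard error
t  = quantile(TDist(N - 2), 0.975)   # ≈ 2.306 for 8 degrees of freedom
σb95 = t * σb
```

With these inputs σb95 ≈ 0.163, i.e. the slope's 95% interval is about 2.3 times its one-sigma uncertainty for N = 10.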

Plotting is off by default (isplot=false); pass isplot=true to display the results.
Plotting currently uses Plots.jl with the gr() backend.

Installation

julia> ] add LinearFitXYerrors
julia> using LinearFitXYerrors

Usage

# The input data and regression results are returned in the fields of the `st` structure (::stfitxy):

st = linearfitxy(X, Y)    # no errors in X and Y, no plot displayed

st = linearfitxy(X, Y; σX, σY, isplot=true)    # X-Y errors non-correlated (r=0); plot with ratio=1

st = linearfitxy(X, Y; σX, σY, r=0, isplot=true, ratio=:auto)  # X-Y errors non-correlated (r=0); plot with auto ratio
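For readers curious about what happens under the hood, here is a minimal, self-contained sketch of York's iterative solution for the slope and intercept only. The function name `york_fit` and its layout are invented for this illustration; the package's actual implementation also computes the uncertainties, confidence intervals, ρ and S:

```julia
using Statistics

# Sketch of the York et al. (2004) iteration. Weights are ω = 1/σ²,
# r is the per-point (or scalar) correlation between the X and Y errors.
function york_fit(X, Y, ωX, ωY; r=0.0, tol=1e-10, maxiter=100)
    b = cov(X, Y) / var(X)          # ordinary least-squares slope as initial guess
    α = sqrt.(ωX .* ωY)
    a = 0.0
    for _ in 1:maxiter
        # combined weights for the current slope estimate
        W = ωX .* ωY ./ (ωX .+ b^2 .* ωY .- 2b .* r .* α)
        X̄ = sum(W .* X) / sum(W)
        Ȳ = sum(W .* Y) / sum(W)
        U = X .- X̄
        V = Y .- Ȳ
        β = W .* (U ./ ωY .+ b .* V ./ ωX .- (b .* U .+ V) .* r ./ α)
        bnew = sum(W .* β .* V) / sum(W .* β .* U)   # updated slope
        a = Ȳ - bnew * X̄                             # intercept from the centroid
        converged = abs(bnew - b) < tol
        b = bnew
        converged && break
    end
    return a, b
end

# York (1966) test data, also used in Cantrell (2008); ω are the published weights
X  = [0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4]
Y  = [5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5]
ωX = [1000.0, 1000, 500, 800, 200, 80, 60, 20, 1.8, 1]
ωY = [1.0, 1.8, 4, 8, 20, 20, 70, 70, 100, 500]
a, b = york_fit(X, Y, ωX, ωY)   # converges to the published Y ≈ 5.4799 − 0.4805*X
```

Note how the weights W are recomputed from the current slope on every pass, which is what distinguishes York's method from a one-shot weighted least squares.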

Notes:

  • The objective for this first package was to learn how to publish a Julia package via Github while implementing York's technique.
  • Currently, the "hyperbolic" confidence-interval plot ribbons are provided only when the input data have no errors; in all cases, however, linear ribbons accounting for the standard deviation of the regression results are produced.
  • The package author is not a statistician and the topics of "errors in variables" and "confidence intervals" are beyond his expertise.
  • While the results seem consistent with the references provided, one notable exception is Amen (2012), whose Example-2 standard-deviation estimates for the regression appear much smaller than those obtained with this package's technique (York et al., 2004). However, the input data in that example have large correlated errors, and York's solution seems reasonable (to be confirmed).

References:

Altman, D. and Gardner, M. [1988] Statistics in Medicine: Calculating confidence intervals for regression and correlation. British Medical Journal, 296(6631), pp.1238–1242.

Amen, S. [2012] Linear estimation for data with error ellipses. MSc thesis in Statistics, University of Texas.

Cantrell, C. [2008] Technical Note: Review of methods for linear least-squares fitting of data and application to atmospheric chemistry problems. Atmospheric Chemistry and Physics, 8(17), pp.5477–5487.

Mahon, K. [1996] The New “York” Regression: Application of an Improved Statistical Method to Geochemistry. International Geology Review, 38(4), pp.293–303

Reduced Chi-squared Test: https://en.wikipedia.org/wiki/Reduced_chi-squared_statistic

Regression dilution: https://en.wikipedia.org/wiki/Regression_dilution

York, D. [1966] Least-squares fitting of a straight line. Canadian Journal of Physics, 44(5), pp.1079–1086

York, D. [1969] Least squares fitting of a straight line with correlated errors. Earth and Planetary Science Letters, 5, pp.320–324.

York, D., Evensen, N., Martinez, M. and Delgado, J. [2004] Unified equations for the slope, intercept and standard errors of the best straight line. American Journal of Physics, 72(3).

NB: the Julia code for the examples displayed below can be found in the repository's examples folder.

Example-0: no errors in X and Y

Reference: Altman and Gardner (1988)

[figure: Example0b_LinearFitXYerrors plot]

Example-1: uncorrelated errors in X and Y

References: York (1966) and Cantrell (2008)

[figure: Example1_LinearFitXYerrors plot]

Example-2: correlated errors in X and Y

Reference: Amen (2012)

[figure: Example2_LinearFitXYerrors plot]

Example-3: correlated errors in X and Y

Reference: Mahon (1996)

[figure: Example3_LinearFitXYerrors plot]

Issues

Setting intercept to 0

Is it possible to set the intercept to 0? I tried using GLM.jl to do the same but I get a collinearity error. I was wondering if there is a way to do it in this package, since it does exactly what I want.

Best,
W

A GMT version of linearfitxy

Hi Rafael,

I've ported your linearfitxy function to work in GMT.jl and in the migration process the Distributions and Plots dependencies were dropped (replaced by GMT.jl functions alone).

Taking one of your examples I can now do

D = Misc.linearfitxy([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4], [5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5], sx=1 ./ sqrt.([1000., 1000, 500, 800, 200, 80,  60, 20, 1.8, 1]), sy=1 ./ sqrt.([1., 1.8, 4, 8, 20, 20, 70, 70, 100, 500]));

>>> [± σ]  Y = (5.4799 ± 0.3592) + (-0.4805 ± 0.0706)*X
>>> [± 95% CI]  Y = (5.4799 ± 0.8284) + (-0.4805 ± 0.1629)*X
>>> Pearson ρ = -0.943;  Goodness of fit = 1.218

# This one was entirely re-written in *GMT.jl* and allows tuning all plotting parameters
plot_linfitxy(D, band_ab=true, band_CI=true, ellipses=true, show=true)

and get the figure below.

What I would like to ask you is if you would like to make a PR to GMT.jl with the contents of the ported function (in the zip attached to this post) so that the credits for this work are more openly visible.

Cheers

Joaquim

linearfitxy.zip

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Kills Kernel

Hi, I'm using this package to display regression error. When I call linearfitxy(x,y; plot=true) it prints the errors and the Pearson coefficient, but then it kills my Jupyter kernel and I have to restart.
