
This project forked from brian-lau/colldiag


Matlab code for diagnosing collinearity in a regression design matrix

License: GNU General Public License v3.0



colldiag

This download provides a couple of Matlab functions for determining the degree and nature of collinearity (also termed multicollinearity) in a regression design matrix. Given a design matrix, the functions return the condition indices (the ratio of the largest singular value to each singular value), the variance decomposition proportions, and the variance inflation factors. Belsley, Kuh, & Welsch [1] suggest diagnosing degrading collinearity by looking for the following pair of conditions:

  1. A singular value with a condition index judged to be large, which is associated with
  2. Large variance decomposition proportions for two or more covariates
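
As a rough illustration of the two quantities named above, the following sketch computes condition indices and variance decomposition proportions from the SVD of a small synthetic matrix. It is not the code used by colldiag. The column scaling to unit length (without centering) follows the convention in Belsley et al. and is an assumption here, and all variable names and the synthetic data are made up for the example.

```matlab
% Sketch only: condition indices and variance decomposition proportions
% (VDPs) computed from the SVD. Not colldiag's actual implementation.
rng(0);                                   % reproducible synthetic data
n  = 100;
x1 = randn(n, 1);
x2 = randn(n, 1);
x3 = x1 + x2 + 1e-3 * randn(n, 1);        % near dependency among x1, x2, x3
X  = [x1, x2, x3];

% Assumption: scale each column to unit Euclidean length, no centering
Xs = bsxfun(@rdivide, X, sqrt(sum(X.^2, 1)));

[~, S, V] = svd(Xs, 0);                   % economy-size SVD
s = diag(S);                              % singular values, largest first

% Condition index: largest singular value divided by each singular value
condInd = s(1) ./ s;

% phi(k,j) = V(k,j)^2 / s(j)^2; var(b_k) is proportional to sum_j phi(k,j)
phi = bsxfun(@rdivide, V.^2, (s.^2)');

% VDPs: rows follow the singular values, columns follow the covariates,
% so every column sums to 1
vdp = bsxfun(@rdivide, phi, sum(phi, 2))';

disp([condInd, vdp])
```

Each row of the displayed matrix pairs one condition index with the share of every coefficient's variance attributable to that singular value, which is the layout the two conditions above are read from.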

The number of large condition indexes identifies the number of near dependencies among the columns of the design matrix. Large variance decomposition proportions identify covariates that are involved in the corresponding near dependency, and the magnitude of these proportions, in conjunction with the condition index, provides a measure of the degree to which the corresponding regression estimate has been degraded by the presence of collinearity. What is meant by "large" is not statistically precise, although numerical experiments by Belsley et al. indicate that the following ranges are useful:

Condition index    Collinearity
5 < CI < 10        weak
30 < CI < 100      moderate to strong
CI > 100           severe

In addition, a pair (or more) of variance decomposition proportions greater than 0.5 associated with the same condition index warrants inspection.
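
The rule of thumb above can be sketched as a simple filter. The diagnostics below are made-up values for a hypothetical four-covariate design, not output from colldiag, and the cutoff of 30 is an assumption taken from the lower edge of the "moderate to strong" range.

```matlab
% Hypothetical diagnostics (illustrative values, not real output).
% Rows of vdp follow condInd; columns are covariates and sum to 1.
condInd = [1; 6; 45; 180];
vdp = [0.00 0.01 0.01 0.02
       0.02 0.03 0.05 0.10
       0.03 0.06 0.90 0.85
       0.95 0.90 0.04 0.03];

ciThresh  = 30;     % assumed cutoff: lower edge of "moderate to strong"
vdpThresh = 0.5;    % proportions above this warrant inspection

isLargeCI = condInd > ciThresh;             % condition 1: large condition index
nLargeVdp = sum(vdp > vdpThresh, 2);        % count of large proportions per row
flagged   = find(isLargeCI & nLargeVdp >= 2);   % condition 2: two or more

for i = flagged'
    involved = find(vdp(i, :) > vdpThresh);
    fprintf('CI = %g involves covariates %s\n', condInd(i), mat2str(involved));
end
```

Each flagged row corresponds to one near dependency, and the listed covariates are the ones whose estimates that dependency degrades.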

The main function prints a summary table to stdout when called without outputs, which may be sufficient to identify problems with smaller design matrices. For models with more covariates, I've included a function to make a collinearity tableplot [2], which allows one to more easily determine the degree of collinearity and pinpoint problematic covariates. More information about tableplots can be found at Michael Friendly's site, where he's posted R software for making these plots.

  1. Belsley, DA, Kuh, E, Welsch, RE (1980). Regression diagnostics: Identifying influential data and sources of collinearity. Wiley
  2. Friendly, M, Kwan, E (2009). Where's Waldo: Visualizing collinearity diagnostics. The American Statistician, 63(1):56-65

Instructions and example

Install the functions under your Matlab path and have a look at demo_colldiag.m. The following will get you started right away:

>> info = colldiag(x); % calculate collinearity diagnostics on design matrix x [nSamples x nVariables]
>> disp(info.str) % print info to stdout
>> colldiag_tableplot(info); % collinearity tableplot
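
If you have no data at hand, a small synthetic design matrix with a built-in near dependency is enough to try the calls above. The sketch below is only an illustration: the data and variable names are invented, the VIF cross-check uses the standard inverse-correlation-matrix formula rather than anything taken from colldiag, and the colldiag calls are guarded so the script still runs when the functions are not on the path.

```matlab
% Sketch: synthetic design matrix with one near dependency (made-up data)
rng(1);
nSamples = 200;
a = randn(nSamples, 1);
b = randn(nSamples, 1);
c = 2*a - b + 0.01 * randn(nSamples, 1);   % nearly a linear combination of a and b
d = randn(nSamples, 1);                    % unrelated covariate
x = [a, b, c, d];                          % [nSamples x nVariables]

% Independent cross-check: variance inflation factors as the diagonal
% of the inverse correlation matrix (standard formula, assumed here)
R   = corrcoef(x);
vif = diag(inv(R))';
disp(vif)

% Run the diagnostics only if colldiag is installed on the Matlab path
if exist('colldiag', 'file') == 2
    info = colldiag(x);          % calls as given in the instructions above
    disp(info.str)
    colldiag_tableplot(info);
end
```

The first three covariates should show large variance inflation factors, while the unrelated fourth covariate should stay close to 1.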

The following figure reproduces an example from section 3.4 of Belsley et al. [1] that uses consumption function data well known to be ill-conditioned.

[Figure]

Contributions

Copyright (c) 2014 Brian Lau, see LICENSE

tight_subplot is copyright Pekka Kumpulainen
