
stepreg's Introduction

StepReg


  • An R package for stepwise regression analysis


How to install

For the released version:

install.packages("StepReg")

For the development version:

library(devtools)
install_github("JunhuiLi1017/StepReg")

Detailed usage

For the released version, refer to the CRAN vignettes.

For the development version, refer to this vignette.

Questions?

Please raise an issue here.


stepreg's Issues

Find the optimal model using step wise linear regression

Thank you for creating the package; it is easy to use and the examples are very useful. I want to perform stepwise linear regression with backward elimination to find the optimal model based on R-squared. So far, I have managed to do that by brute force (please see the code below).

I have an sf object with two columns. The column I am interested in is called fclass. It holds a set of unique values that I want to use as predictors. My baseline model therefore includes all the classes (i.e., the whole sf object). The stepwise procedure should eliminate unique values from the fclass column one at a time and finally print the remaining values that yield the largest R-squared.

My code so far:

library(pacman)
pacman::p_load(terra, sf, dplyr)

ntl <- rast("path/ntl.tif") # response

v <- st_read("path/road.shp") # sf object

# baseline r2
vterra <- vect(v)

ref <- rast("path/pop.tif") # get ext and pixel size

r <- rast(v, res = res(ref), ext = ext(ref))

x <- rasterizeGeom(vterra, r, "length", "m")

x_res <- resample(x, ntl, "average")

s <- c(ntl, x_res)
names(s) <- c("ntl", "road")

# linear model containing all the unique values from the predictor variable
m <- lm(ntl ~ road, s, na.action = na.exclude)
baseline <- sqrt(summary(m)$adj.r.squared)
orig_baseline <- sqrt(summary(m)$adj.r.squared)

classes <- unique(v$fclass)
inclasses <- unique(v$fclass)

i <- 1
while (i <= length(classes)) {
  class <- classes[i]
  print(paste0("current class ", class))
  print(paste0("orig baseline: ", orig_baseline, " - baseline: ", baseline))
  print(classes)
  
  v_filtered <- v[v$fclass != class, ]
  vterra <- vect(v_filtered)
  r <- rast(v, res = res(ref), ext = ext(ref))
  x <- rasterizeGeom(vterra, r, "length", "m")
  
  x_res <- resample(x, ntl, "average")
  
  s <- c(ntl, x_res)
  names(s) <- c("ntl", "road")
  
  m <- lm(ntl ~ road, s, na.action = na.exclude)
  class_r2 <- sqrt(summary(m)$adj.r.squared)
  
  if (class_r2 > baseline) {
    classes <- classes[-i]
    baseline <- class_r2
  } else {
    print("class_r2 is less than baseline")
    print(paste0("class_r2: ", class_r2, " - baseline: ", baseline))
    i <- i + 1
  }
}

This works, but it is not efficient (it takes about 5 minutes).

head(v, 6)
Simple feature collection with 6 features and 1 field
Geometry type: MULTILINESTRING
Dimension:     XY
Bounding box:  xmin: 598675.9 ymin: 7111459 xmax: 609432.8 ymax: 7118729
Projected CRS: WGS 84 / UTM zone 35S
         fclass                       geometry
1     secondary MULTILINESTRING ((598675.9 ...
2     secondary MULTILINESTRING ((600641.7 ...
3   residential MULTILINESTRING ((601734.8 ...
4   residential MULTILINESTRING ((601163.9 ...
5   residential MULTILINESTRING ((601104.2 ...
6 motorway_link MULTILINESTRING ((609432.8 ...

The unique values in the fclass column:

unique(v$fclass)
 [1] "secondary"      "residential"    "motorway_link"  "service"        "primary"        "unclassified"   "motorway"      
 [8] "tertiary"       "trunk"          "primary_link"   "footway"        "track"          "secondary_link" "unknown"       
[15] "living_street"  "pedestrian"     "path"           "bridleway"      "steps"          "trunk_link"     "track_grade1"  
[22] "track_grade3"   "track_grade5"   "cycleway"       "track_grade4"   "tertiary_link"  "track_grade2" 

I was wondering if you could help me use the stepwise function to perform the regression I described above. You can download the data from Google Drive.

Best wishes.
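
One possible way to speed up the brute-force loop above is to avoid re-rasterizing the roads on every iteration. The sketch below assumes the road length per pixel can be computed once per fclass value and that the total for any subset of classes is simply the sum of those per-class layers. It uses synthetic data in place of the rasters: the matrix `len_by_class`, the response `ntl`, and the helper `adj_r2` are all hypothetical stand-ins, and nothing here uses the StepReg API. The elimination logic mirrors the original loop but compares adjusted R-squared directly, without the square root.

```r
# Synthetic stand-in for per-class road-length layers: one column per class,
# one row per pixel. In the real workflow each column would come from
# rasterizing a single fclass once (that step is assumed, not shown).
set.seed(42)
n_pixels <- 500
classes <- c("primary", "secondary", "residential", "track", "path")
len_by_class <- matrix(rexp(n_pixels * length(classes)),
                       nrow = n_pixels,
                       dimnames = list(NULL, classes))

# Hypothetical response: only three of the classes actually drive it.
ntl <- 2 * rowSums(len_by_class[, c("primary", "secondary", "residential")]) +
  rnorm(n_pixels, sd = 0.5)

# Adjusted R-squared of ntl ~ road, where road is the summed length
# of the classes that are currently kept.
adj_r2 <- function(keep) {
  road <- rowSums(len_by_class[, keep, drop = FALSE])
  summary(lm(ntl ~ road))$adj.r.squared
}

keep <- classes
best <- adj_r2(keep)
i <- 1
while (i <= length(keep) && length(keep) > 1) {
  trial <- keep[-i]
  trial_r2 <- adj_r2(trial)
  if (trial_r2 > best) {
    keep <- trial   # dropping this class improved the fit
    best <- trial_r2
  } else {
    i <- i + 1      # keep the class and move to the next one
  }
}

print(keep)
print(best)
```

With this layout each iteration only sums a few columns and fits one small linear model, so the expensive spatial operations run once per class rather than once per trial. Whether the per-class layers really add up to the same raster as rasterizing the filtered object should be checked on the actual data.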

stepwiseLogit invalid 'width' argument

Hello,
I got the following problem when running the stepwiseLogit example from this package.

formula <- vs ~ .
StepReg::stepwiseLogit(formula,
                       data = mtcars,
                       include = NULL,
                       selection = "bidirection",
                       select = "SL",
                       sle = 0.15,
                       sls = 0.15,
                       sigMethod = "Rao",
                       weights = NULL,
                       best = NULL)

Error in format.default(text, width = sum(lengths), justify = "centre") :
invalid 'width' argument
