
fastadaboost's Introduction

fastAdaboost

fastAdaboost is a blazingly fast implementation of Adaboost for R. It uses C++ code in the backend and is about 50 times faster than the native R libraries currently available, which is especially useful when your data is large. fastAdaboost currently supports only binary classification tasks. It implements Freund and Schapire's Adaboost.M1 and Zhu et al.'s SAMME.R (real Adaboost) algorithms.
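As a rough illustration of what Adaboost.M1 does in each boosting round, here is a minimal base R sketch of the reweighting step. The function name `adaboost_m1_round` and the toy labels are made up for this example, and the `log((1 - err) / err)` scaling is one common convention; this is not the package's C++ code.

```r
# One round of Adaboost.M1 reweighting (illustrative sketch only).
# Assumes the weak learner's weighted error satisfies 0 < err < 0.5.
adaboost_m1_round <- function(y_true, y_pred, w) {
  miss <- y_true != y_pred                 # TRUE where the weak learner is wrong
  err <- sum(w[miss]) / sum(w)             # weighted error of the weak learner
  alpha <- log((1 - err) / err)            # vote given to this weak learner
  w_new <- w * exp(alpha * miss)           # up-weight misclassified points only
  list(err = err, alpha = alpha, weights = w_new / sum(w_new))
}

# Toy example: five points, the second one is misclassified.
y_true <- c(0, 0, 1, 1, 1)
y_pred <- c(0, 1, 1, 1, 1)
w <- rep(1 / 5, 5)

round1 <- adaboost_m1_round(y_true, y_pred, w)
print(round1$weights)
#> [1] 0.125 0.500 0.125 0.125 0.125
```

After one round the single misclassified point carries half of the total weight, so the next weak learner is pushed to get it right.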

Install

Install from CRAN:

install.packages("fastAdaboost")

Install from GitHub:

devtools::install_github("souravc83/fastAdaboost")

Quick Demo

library("fastAdaboost")
set.seed(9999)

num_each <- 1000
fakedata <- data.frame( X=c(rnorm(num_each,0,1),rnorm(num_each,1.5,1)), Y=c(rep(0,num_each),rep(1,num_each) ) )
fakedata$Y <- factor(fakedata$Y)
# run adaboost with 10 boosting iterations
test_adaboost <- adaboost(Y~X, fakedata, 10)
pred <- predict( test_adaboost, newdata=fakedata)
print(paste("Adaboost Error on fakedata:",pred$error))
#> [1] "Adaboost Error on fakedata: 0.1225"
print(table(pred$class,fakedata$Y))
#>    
#>       0   1
#>   0 848  93
#>   1 152 907

test_real_adaboost <- real_adaboost(Y~X, fakedata, 10)
pred_real <- predict(test_real_adaboost,newdata=fakedata)
print(paste("Real Adaboost Error on fakedata:", pred_real$error))
#> [1] "Real Adaboost Error on fakedata: 0.1105"
print(table(pred_real$class,fakedata$Y))
#>    
#>       0   1
#>   0 906 127
#>   1  94 873
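The reported error is the share of off-diagonal counts in each table, where rows are predictions and columns are true labels. The check below re-enters the printed counts by hand; `error_from_table`, `ada_tab` and `real_tab` are names invented for this example.

```r
# Misclassification rate from a 2x2 confusion table (predicted x actual).
error_from_table <- function(tab) {
  (sum(tab) - sum(diag(tab))) / sum(tab)
}

# Counts copied from the demo output; matrix() fills column by column.
ada_tab <- matrix(c(848, 152, 93, 907), nrow = 2,
                  dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))
real_tab <- matrix(c(906, 94, 127, 873), nrow = 2,
                   dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))

ada_error <- error_from_table(ada_tab)
real_error <- error_from_table(real_tab)
print(c(adaboost = ada_error, real_adaboost = real_error))
```

Both values agree with `pred$error` and `pred_real$error` in the demo (0.1225 and 0.1105). Note that these are errors on the training data itself.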

Performance Benchmarking

How fast is fastAdaboost compared to native R implementations? I used the microbenchmark package to compare the running time of fastAdaboost with that of adabag, one of the most popular native R libraries implementing the Adaboost algorithm. The benchmark indicates that fastAdaboost trains about 45-50 times faster than the R-based implementation, while prediction times are much closer. This is a huge benefit when data sizes are large.

library(microbenchmark)
library(adabag)
library(MASS)

#using fastAdaboost
data(bacteria)
print(
  microbenchmark(
    boost_obj <- adaboost(y~.,bacteria , 10),
    pred <- predict(boost_obj,bacteria)
  )
)
#> Unit: milliseconds
#>                                        expr      min       lq    mean
#>  boost_obj <- adaboost(y ~ ., bacteria, 10) 58.01665 58.69384 60.6658
#>        pred <- predict(boost_obj, bacteria) 26.91593 27.41415 29.5689
#>    median       uq      max neval cld
#>  59.20298 60.13180 74.54155   100   b
#>  27.91902 32.50484 37.58375   100  a

#using adabag
print(
  microbenchmark(
    adabag_obj <-boosting(y~.,bacteria,boos=F,mfinal=10),
    pred_adabag <- predict(adabag_obj, bacteria)
  )
)
#> Unit: milliseconds
#>                                                            expr        min
#>  adabag_obj <- boosting(y ~ ., bacteria, boos = F, mfinal = 10) 2497.55208
#>                    pred_adabag <- predict(adabag_obj, bacteria)   34.50564
#>          lq       mean     median         uq       max neval cld
#>  2659.99737 2848.80065 2809.39769 2988.49017 3629.1527   100   b
#>    35.72336   45.21379   37.16913   42.22947  242.7932   100  a
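To make the comparison explicit, the snippet below computes speedup ratios from the median timings. The numbers are copied by hand from the two microbenchmark outputs above, and `median_ms` is just an illustrative name.

```r
# Median timings in milliseconds, copied from the benchmark output.
median_ms <- data.frame(
  step = c("train", "predict"),
  fastAdaboost = c(59.20298, 27.91902),
  adabag = c(2809.39769, 37.16913)
)

# Speedup = adabag time divided by fastAdaboost time.
median_ms$speedup <- median_ms$adabag / median_ms$fastAdaboost
print(median_ms)
```

On these medians the training speedup is roughly 47x, while prediction is only about 1.3x faster, so nearly all of the gain comes from model fitting.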

fastadaboost's People

Contributors

souravc83


fastadaboost's Issues

rpart.control

In the code for wrap_rpart, rpart_control is automatically set to rpart.control(cp=0).

It would be great to be able to modify this parameter, and rpart.control in general, through an argument to adaboost. Is this possible?
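One way such an option could look is sketched below. `fit_tree_with_control` is a hypothetical wrapper written for this example, not the package's `wrap_rpart`, and it leaves out the observation weights that boosting needs; it only shows an `rpart.control` object being passed through, with `cp = 0` kept as the default.

```r
library(rpart)

# Hypothetical wrapper: callers may override the default rpart.control(cp = 0).
fit_tree_with_control <- function(formula, data,
                                  control = rpart.control(cp = 0)) {
  rpart(formula, data = data, method = "class", control = control)
}

# Toy data in the same shape as the Quick Demo, but smaller.
set.seed(9999)
num_each <- 200
toy <- data.frame(
  X = c(rnorm(num_each, 0, 1), rnorm(num_each, 1.5, 1)),
  Y = factor(c(rep(0, num_each), rep(1, num_each)))
)

# Default behaviour (cp = 0) versus a user-supplied control (decision stump).
default_tree <- fit_tree_with_control(Y ~ X, toy)
stump <- fit_tree_with_control(Y ~ X, toy,
                               control = rpart.control(maxdepth = 1))

# Number of nodes in each fitted tree.
print(c(default = nrow(default_tree$frame), stump = nrow(stump$frame)))
```

Keeping `cp = 0` as the default argument would preserve the current behaviour for existing callers while letting others tune tree depth or minimum split size.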
