This project forked from slds-lmu/paper_2019_reinbo

AutoML with ReinBo

Home Page: https://github.com/jakob-r/rsmac

Introduction

ReinBo is an AutoML solution in R that optimizes machine learning pipelines with Bayesian Optimization embedded Reinforcement Learning.

Read our ECML Auto Data Science workshop paper for more information.

Xudong Sun, Jiali Lin, and Bernd Bischl. ReinBo: Machine Learning pipeline search and configuration with Bayesian Optimization embedded Reinforcement Learning. arXiv preprint, 2019. https://arxiv.org/abs/1904.05381

For citation, see reinbo_citation.bib in this repository.

Reproduce the benchmark experiments

The code to reproduce the experiments in the paper is in the directory "benchmark".

  • Install the required packages through benchmark/install_depend.R
  • Learn how to use the R CRAN package batchtools for large-scale benchmark studies
  • In the folder benchmark, execute main.R, then submit the jobs according to the batchtools API
  • There are 600 jobs in total
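
For readers new to batchtools, the define/submit/collect cycle can be tried on a toy problem first. The sketch below is not the benchmark itself: the temporary registry and the squaring function are made-up stand-ins, whereas main.R sets up the real jobs.

```r
library(batchtools)

# Temporary registry for illustration only (file.dir = NA uses a temp directory);
# the benchmark's own registry is created by main.R.
reg = makeRegistry(file.dir = NA, seed = 1)

# Define three toy jobs: one per value of x.
batchMap(fun = function(x) x^2, x = 1:3, reg = reg)

# Submit every job that has not been submitted yet, then wait for completion.
ids = findNotSubmitted(reg = reg)
submitJobs(ids, reg = reg)
waitForJobs(reg = reg)

# Inspect the job status and collect the results as a list.
getStatus(reg = reg)
results = reduceResultsList(reg = reg)
```

On a cluster, the same calls apply once the registry is configured with the appropriate cluster functions; findNotSubmitted() makes it easy to resubmit only the jobs that are still missing.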

Installation of the Package

We also provide an R package. It depends on the rlR package, which must be installed first.

Install rlR through

devtools::install_github("smilesun/rlR")

Read the rlR instructions to make sure it works on your computer.

Afterwards, install ReinBo:

devtools::install_github("compstat-lmu/paper_2019_ReinBo")

Using ReinBo Package

Load the package; ReinBo is then ready to optimize a pipeline.

library(ReinBo)
best_model = reinbo(task = mlrTask, budget = 1000L, train_set = train_set, custom_operators = NULL)
  • task: must be an mlr task (currently only classification tasks are accepted)
  • budget: maximum number of pipelines to evaluate
  • train_set: index vector of the observations used for optimization
  • custom_operators: set to NULL to use all default operators for the pipeline

A typical ML pipeline consists of 3 stages: preprocessing, filtering and classification. Below is a list of the built-in operators that currently come with ReinBo at each stage:

  • preprocess: "cpoScale()", "cpoScale(scale = FALSE)", "cpoScale(center = FALSE)", "cpoSpatialSign()", "NA";

  • filter: "cpoFilterAnova(perc)", "cpoFilterKruskal(perc)", "cpoPca(center = FALSE, rank)", "cpoFilterUnivariate(perc)", "NA";

  • classifier: "classif.ksvm", "classif.ranger", "classif.kknn", "classif.xgboost", "classif.naiveBayes";

where "NA" indicates that no operator is applied at that stage.

Users can also select a subset of operators, for example:

custom_operators = list(preprocess = c("cpoScale()", "cpoSpatialSign()", "NA"),
                        filter = NULL, # using all filtering operators
                        classifier = c("classif.kknn", "classif.naiveBayes"))
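
To make the effect of such a selection concrete, here is a minimal sketch in base R. The helper resolve_operators is hypothetical and not part of ReinBo; it only mimics the rule described above (a NULL stage falls back to all built-in operators) and assumes that custom choices must be a subset of the built-in list.

```r
# Built-in operators per stage, copied from the list above.
default_operators = list(
  preprocess = c("cpoScale()", "cpoScale(scale = FALSE)",
                 "cpoScale(center = FALSE)", "cpoSpatialSign()", "NA"),
  filter = c("cpoFilterAnova(perc)", "cpoFilterKruskal(perc)",
             "cpoPca(center = FALSE, rank)", "cpoFilterUnivariate(perc)", "NA"),
  classifier = c("classif.ksvm", "classif.ranger", "classif.kknn",
                 "classif.xgboost", "classif.naiveBayes")
)

# Hypothetical helper (NOT a ReinBo function): fill NULL stages with the
# defaults and reject operators that are not in the built-in list.
resolve_operators = function(custom, defaults = default_operators) {
  resolved = defaults
  for (stage in names(defaults)) {
    chosen = custom[[stage]]
    if (is.null(chosen)) next  # NULL or missing stage: keep all defaults
    unknown = setdiff(chosen, defaults[[stage]])
    if (length(unknown) > 0L) {
      stop("Unknown operator(s) in stage '", stage, "': ",
           paste(unknown, collapse = ", "))
    }
    resolved[[stage]] = chosen
  }
  resolved
}

custom_operators = list(preprocess = c("cpoScale()", "cpoSpatialSign()", "NA"),
                        filter = NULL, # using all filtering operators
                        classifier = c("classif.kknn", "classif.naiveBayes"))

resolved = resolve_operators(custom_operators)

# Number of distinct operator combinations, ignoring hyperparameters.
n_custom = prod(lengths(resolved))
n_default = prod(lengths(default_operators))
```

With this selection, 3 preprocessing choices, 5 filters and 2 classifiers remain, so the number of operator combinations drops from 125 to 30 before any hyperparameters are tuned.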

Example

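library(ReinBo)
library(mlrCPO)
library(OpenML)
# Download OpenML task 37, convert it to an mlr task, and dummy-encode categorical features
task = convertOMLTaskToMlr(getOMLTask(37))$mlr.task %>>% cpoDummyEncode(reference.cat = FALSE)
# Holdout split: train_set is used for optimization, test_set is kept aside
split = makeResampleInstance("Holdout", task)
train_set = split$train.inds[[1]]
test_set = split$test.inds[[1]]
# Search for the best pipeline, evaluating at most 100 pipelines
best_model = reinbo(task = task, budget = 100L, train_set = train_set, custom_operators = NULL)
# Show the best pipeline with its hyperparameters and performance
print(best_model$mmodel)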
##                                                  Model        C     sigma
## 13 cpoScale()\tcpoFilterUnivariate(perc)\tclassif.ksvm 6.976259 -7.312171
##         perc          y
## 13 0.5984765 -0.2304947

y in the result is the negative mmce (mean misclassification error) of the best model, so the pipeline above has an mmce of about 0.23.
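
As a small illustration of how to read such a result row, the following base R sketch rebuilds the printed row by hand and decodes it. It assumes that the three stages in the Model column are separated by tab characters, as the "\t" in the printed output suggests; the data frame here is typed in manually and is not produced by ReinBo.

```r
# Row copied by hand from the printed result above.
result = data.frame(
  Model = "cpoScale()\tcpoFilterUnivariate(perc)\tclassif.ksvm",
  C = 6.976259, sigma = -7.312171, perc = 0.5984765, y = -0.2304947,
  stringsAsFactors = FALSE
)

# Split the pipeline description into its three stages (assumed tab-separated).
stages = strsplit(result$Model, "\t", fixed = TRUE)[[1]]
names(stages) = c("preprocess", "filter", "classifier")

# y is the negative mmce, so flip the sign to get the error rate.
mmce = -result$y
accuracy = 1 - mmce

print(stages)
print(round(c(mmce = mmce, accuracy = accuracy), 4))
```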
