maxhalford / starboost

:star::rocket: Gradient boosting on steroids

Home Page: https://maxhalford.github.io/starboost/

License: MIT License

Python 100.00%
gradient-boosting machine-learning python scikit-learn

starboost's Introduction

Please check out the website if you're looking for the documentation!

What is this?

This is StarBoost, a Python library that implements gradient boosting. Gradient boosting is an efficient and popular machine learning algorithm used for supervised learning.

Doesn't scikit-learn already do that?

Indeed, scikit-learn implements gradient boosting, but the only weak learner it supports is a decision tree. In principle, gradient boosting works with weak learners other than decision trees.

What about XGBoost/LightGBM/CatBoost?

These libraries are the state of the art for gradient boosted decision trees (GBDT). They implement a specific version of gradient boosting that is tailored to decision trees. StarBoost's purpose isn't to compete with them. Instead, its goal is to implement a generic gradient boosting algorithm that works with any weak learner.

A focus of StarBoost is to keep the code readable and commented, instead of obfuscating the algorithm under a pile of tangled code.

What's a weak learner?

A weak learner is any machine learning model that can learn from labeled data. It's called "weak" because it usually works better as part of an ensemble (such as gradient boosting). Examples are linear models, radial basis functions, decision trees, genetic programming, neural networks, etc. In theory you could even use gradient boosting as a weak learner.
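
To make the idea of a pluggable weak learner concrete, here is a minimal pure-Python sketch of gradient boosting with squared-error loss. It is not StarBoost's implementation: TinyBooster and LineLearner are hypothetical names used only for illustration, and the sketch assumes nothing more than a weak learner exposing fit and predict.

```python
# Illustrative sketch only; these classes are hypothetical and not part of StarBoost.


class LineLearner:
    """Weak learner: a least-squares line fitted on the first feature."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        n = len(xs)
        x_mean = sum(xs) / n
        y_mean = sum(y) / n
        var = sum((x - x_mean) ** 2 for x in xs)
        cov = sum((x - x_mean) * (t - y_mean) for x, t in zip(xs, y))
        self.slope = cov / var if var > 0 else 0.0
        self.intercept = y_mean - self.slope * x_mean
        return self

    def predict(self, X):
        return [self.intercept + self.slope * row[0] for row in X]


class TinyBooster:
    """Gradient boosting for squared error with any fit/predict weak learner."""

    def __init__(self, make_learner, n_estimators=30, learning_rate=0.1):
        self.make_learner = make_learner  # callable returning a fresh weak learner
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate

    def fit(self, X, y):
        # Start from the constant prediction that minimises squared error.
        self.init_ = sum(y) / len(y)
        self.learners_ = []
        preds = [self.init_] * len(y)
        for _ in range(self.n_estimators):
            # For squared error, the negative gradient is simply the residual.
            residuals = [t - p for t, p in zip(y, preds)]
            learner = self.make_learner().fit(X, residuals)
            steps = learner.predict(X)
            preds = [p + self.learning_rate * s for p, s in zip(preds, steps)]
            self.learners_.append(learner)
        return self

    def predict(self, X):
        preds = [self.init_] * len(X)
        for learner in self.learners_:
            steps = learner.predict(X)
            preds = [p + self.learning_rate * s for p, s in zip(preds, steps)]
        return preds


# Toy data following y = 2x + 1
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

model = TinyBooster(LineLearner, n_estimators=50, learning_rate=0.5).fit(X, y)
y_pred = model.predict(X)
```

The booster never inspects the learner's internals, so swapping LineLearner for any other class with the same two methods leaves the loop unchanged.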

Is it compatible with scikit-learn?

Yes, it is.

How do I install it?

Barring any weird Python setup, you simply have to run pip install starboost.

How do I use it?

The following snippet shows basic usage of StarBoost. Please check out the examples directory for more comprehensive ones.

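from sklearn import datasets
from sklearn import tree
import starboost as sb

# Built-in regression dataset (load_boston was removed in scikit-learn 1.2)
X, y = datasets.load_diabetes(return_X_y=True)

# Boost 30 shallow regression trees, shrinking each one's contribution by 0.1
model = sb.BoostingRegressor(
    base_estimator=tree.DecisionTreeRegressor(max_depth=3),
    n_estimators=30,
    learning_rate=0.1
)

model = model.fit(X, y)

# Predictions on the training data
y_pred = model.predict(X)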

You can find the source code for running the benchmarks here.

What are you planning on doing next?

  • Logging the progress
  • Handling sample weights
  • Implementing more loss functions
  • Making it faster
  • Newton boosting (taking into account the information from the Hessian)
  • Learning to rank
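
As a rough illustration of what Newton boosting in the list above refers to, the sketch below compares a plain gradient step with a Newton step for the log loss when the update is a single constant. It is a generic textbook illustration, not StarBoost code, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss_grad_hess(y_true, raw_score):
    """First and second derivatives of the log loss w.r.t. the raw score."""
    p = sigmoid(raw_score)
    return p - y_true, p * (1.0 - p)


# Three positives and one negative, all starting from a raw score of 0.
y = [1, 1, 1, 0]
scores = [0.0, 0.0, 0.0, 0.0]

pairs = [log_loss_grad_hess(t, s) for t, s in zip(y, scores)]
grads = [g for g, _ in pairs]
hess = [h for _, h in pairs]

# Gradient boosting: move along the average negative gradient.
gradient_step = -sum(grads) / len(grads)

# Newton boosting: rescale the step using the curvature (the Hessian).
newton_step = -sum(grads) / sum(hess)

# The loss-minimising constant for this data is log(3 / 1).
optimal_step = math.log(3.0)
```

Here the Newton step lands much closer to the optimal constant than the unscaled gradient step, which is the motivation for using second-order information.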

By the way, why is it called "StarBoost"?

As you might already know, in programming the star symbol * often refers to the concept of "everything". The idea is that StarBoost can be used with any weak learner, not just decision trees.

License

The MIT License (MIT). Please see the LICENSE file for more information.

starboost's Issues

base_estimator

Hi, can I change the base_estimator to another model, e.g. linear regression?
