gurobi / gurobi-machinelearning

Formulate trained predictors in Gurobi models

Home Page: https://gurobi-machinelearning.readthedocs.io/

License: Apache License 2.0

Languages: Python 65.46% · Makefile 0.08% · Jupyter Notebook 34.45%

Topics: gurobi, machine-learning, mathematical-optimization, python

gurobi-machinelearning's Introduction


Gurobi Machine Learning

Gurobi Machine Learning is an open-source Python package to formulate trained regression models in a gurobipy model that can then be solved with the Gurobi solver.

The package currently supports various scikit-learn objects. It has limited support for the Keras API of TensorFlow, PyTorch, and XGBoost: only neural networks with ReLU activations can be used with Keras and PyTorch.
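For intuition, formulating a trained linear predictor amounts to adding the affine equality y = w·x + b between the input and output variables of the optimization model. A minimal numpy sketch of that relation (the weights below are made up, not from a real trained model):

```python
import numpy as np

# Hypothetical coefficients of a trained linear regression
w = np.array([2.0, -1.0])
b = 0.5

# In a gurobipy model, x and y would be decision variables linked by the
# equality constraint y == w @ x + b; here we simply evaluate that relation.
x = np.array([1.0, 3.0])
y = w @ x + b  # 2*1.0 - 1*3.0 + 0.5 = -0.5
```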

Documentation

The latest user manual is available on readthedocs.

Contact us

For questions related to using Gurobi Machine Learning please use Gurobi's Forum.

For reporting bugs, issues and feature requests please open an issue.

If you encounter issues with Gurobi or gurobipy please contact Gurobi Support.

Installation

Dependencies

gurobi-machinelearning requires the following:

  • numpy
  • scipy
  • gurobipy

The current version supports the following ML packages:

  • scikit-learn
  • Keras (TensorFlow)
  • PyTorch
  • XGBoost

Installing these packages is only required if the predictor you want to insert uses them (e.g., to insert a Keras-based predictor you need to have tensorflow installed).
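The package's actual import handling may differ; a common pattern for optional dependencies looks like the following sketch (shown for TensorFlow, with a hypothetical flag name):

```python
# Optional-dependency pattern: a framework is imported only if it is present,
# and predictors relying on it are enabled only when the import succeeds.
try:
    import tensorflow  # noqa: F401 -- needed only for Keras predictors
    HAS_TENSORFLOW = True
except ImportError:
    HAS_TENSORFLOW = False
```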

The up-to-date supported and tested versions of each package for the latest release can be found in the documentation.

Pip installation

The easiest way to install gurobi-machinelearning is using pip in a virtual environment:

(.venv) pip install gurobi-machinelearning

This will also install the numpy, scipy and gurobipy dependencies.

Please note that gurobipy is commercial software and requires a license. When installed via pip or conda, gurobipy ships with a free license that is intended for testing and can only solve models of limited size.

Getting a Gurobi License

As an alternative to the bundled limited license, licenses are available that can handle models of all sizes.

As a student or staff member of an academic institution, you qualify for a free, full product license.

For a commercial evaluation, you can request an evaluation license.

Development

We value any level of experience in using Gurobi Machine Learning and would like to encourage you to contribute directly to this project. Please see the Contributing Guide for more information.

Source code

You can clone the latest sources with the command:

git clone [email protected]:Gurobi/gurobi-machinelearning.git

Testing

After cloning the project, you can run the tests by invoking tox. For this, you will need to create a virtual environment and activate it:

python3.10 -m venv .venv
. .venv/bin/activate

Then, you can install tox (>= 3.26.0) and run a few basic tests:

(.venv) pip install tox
(.venv) tox -e py310,pre-commit,docs

tox will install, among others, the aforementioned ML packages into a separate venv. These packages can be quite large, so this might take a while.

Running the full test set

In the above command, we only ran a subset of tests. Running the full set of tests requires having a Gurobi license installed, and is done by running just the tox command without the -e parameter:

(.venv) pip install tox
(.venv) tox

If you don't have a Gurobi license, you can still run the subset of tests, open a PR, and GitHub Actions will run the tests with a full Gurobi license.

Submitting a Pull Request

Before opening a Pull Request, have a look at the full Contributing page to make sure your code complies with our guidelines.

gurobi-machinelearning's People

Contributors

bzfwunde, davidwalz, dependabot[bot], epanemu, jaromilnajman, mattmilten, pobonomo, twbraam


gurobi-machinelearning's Issues

Have an error function to inspect results of optimization

If we have a predictor p that we inserted in a gurobi model m with input variables x and output variables y, a basic check when looking at the results is that the prediction obtained from x using p is indeed y.
In scikit-learn this means checking that:

p.predict(x.X) - y.X

is small.
We should provide a function that computes this error.
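A minimal sketch of such a helper (hypothetical name and signature; the real function would take the solution values of the Gurobi variables as shown above):

```python
import numpy as np

def prediction_error(predict, x_values, y_values):
    """Largest absolute discrepancy between the predictor's output on the
    solution values of x and the solution values of y."""
    return np.max(np.abs(predict(x_values) - y_values))

# Example: predictor doubles its input; y deviates by 0.1 in one entry.
err = prediction_error(lambda x: 2 * x,
                       np.array([1.0, 2.0]),
                       np.array([2.0, 4.1]))
```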

Make the decision tree work with more than one output

It looks trivial to do from the sklearn code.

The hardest part is writing the unit test...

  • Add the code
  • Make a unittest
  • Update documentation (remove limitations 👍 )
  • Add a note somewhere that this is tested but we don't have a use case so one has to be very careful...

Type of variables for classification

When we do classification with logistic regression, setting the output variables to be binary is part of the model. Currently gurobi_ml sets them, but we don't restore them if we remove the predictor.

Logistic Regression Output

Currently for logistic regression we only model predict_proba; we need to model predict (i.e. classification) as well.

  • Add option to the LogisticRegressionConstr
  • Integrate in test (add test for classification)
  • Document it

create build_predictors() for test_adversarial_activations

When you currently run test_formulation::test_adversarial_activations() with sklearn != 1.0.2 you get the following warning:

UserWarning: Trying to unpickle estimator LabelBinarizer from version 1.0.2 when using version 1.1.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
  https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations

All the other predictors are automatically rebuilt if the sklearn version differs from that of the predictors stored as .joblib files in tests/predictors. This rebuild method should be added to the class harboring test_adversarial_activations.

Avoid terminus "member function", just use "method"

The documentation refers to a "member function" in a few places. While technically correct from an OO programming standpoint, this term is rooted deeply in the C++ world. Wouldn't a Pythonist just say "method" for the same thing?

We can deal with all linear regression models of scikit-learn

Currently we only handle ordinary least squares (LinearRegression), but the code can also deal with Lasso and Ridge regressions (with l1 and l2 regularization, respectively). It's just a matter of associating the correct object with each scikit-learn class.
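The reason this works: regularization only changes how a linear model is trained, not how a fitted one predicts. Any fitted sklearn linear model exposes coef_ and intercept_, and prediction is the same affine map in all cases, so the same formulation applies. A sketch (hypothetical helper name):

```python
import numpy as np

def affine_prediction(coef, intercept, x):
    """Prediction of any fitted sklearn linear model (LinearRegression,
    Lasso, Ridge, ...): x @ coef_ + intercept_. The l1/l2 penalties only
    influence the values of coef_ during fitting, not this formula."""
    return np.asarray(x) @ np.asarray(coef) + intercept

# Same formula regardless of which regularized model produced the weights.
value = affine_prediction([1.0, 2.0], 3.0, [1.0, 1.0])
```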

Speed up model generation for decision trees

Most of the time is spent generating indicator constraints.
We should try to make that faster and avoid it when possible (e.g. when bounds already determine the branching direction, or by directly generating big-M constraints).
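A sketch of the bound check hinted at above (hypothetical helper, not the package's code): when the variable bounds already decide which side of a split x <= threshold is taken, no indicator constraint is needed for that node.

```python
def fixed_branch(lb, ub, threshold):
    """Return the branching direction of a split `x <= threshold` when the
    variable bounds [lb, ub] already force it, else None. A forced direction
    means the indicator constraint for this node can be skipped."""
    if ub <= threshold:
        return "left"   # x can never exceed the threshold
    if lb > threshold:
        return "right"  # x always exceeds the threshold
    return None         # undecided: emit an indicator or big-M constraint
```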

Output of print_stats

The output could be prettier, in particular for neural networks, and pipelines should have some sort of table.

Also, clarify how many layers a neural network reports (hidden layers + 1 or hidden layers + 2).

Support pandas input, handle better fixed features

In the price optimization and Janos examples, the problem data is in pandas.
While we manage by converting to numpy, it would certainly be nicer if pandas input was accepted directly.

Also, using gurobipy-pandas, we could better handle the case where some of the input for the predictor we put in the optimization model is "fixed". We currently have to do this in the two examples and it could be nicer (see Gurobi/gurobipy-pandas#52).

The branch mlinexpr is a first attempt at implementing this.

[feature request] support paddle

In China, Baidu's open-source ML package PaddlePaddle (19.7k+ stars) is very widely used (China's No. 1 and among the world's top 3). Meanwhile, Gurobi also has many users in China. To maximize the synergistic benefits of Gurobi and Paddle, I want to build a Paddle-based predictor for gurobi-ml. The advantage of using Paddle is that it can improve the open-source ecosystem of gurobi-ml and attract more users. @pobonomo would you be willing to accept a PR?

Formatting issues in the docs

This is a list of things to fix in the documentation

  • Links in the notebooks
  • Find out how the API should really be documented (modules?)
  • Some legal stuff somewhere?
  • Links to how to get a Gurobi license
  • Format links to gurobipy things and other Python packages in the same way
  • Output of Gurobi in the notebooks (there's an ugly trick now...)
  • Explicitly mention supported versions of sklearn/tensorflow/pytorch

Package name

Matthias says a package name with no "-" or "_" is preferable.

Support for LightGBM

The optimization model should be similar to other gradient boosting models, but we need to figure out how to retrieve the regressor from LightGBM.

Cleanup sklearn's Linear and Logistic Regression

Currently they use the neural network base class.

Historically, because of gurobipy.MVar issues, there were many tricks that I didn't want to repeat three times.
Now that gurobipy is fixed, it should actually be very easy to handle them separately.

It would be cleaner for the output of what we added and for the naming of variables.

Fix version of package

It's now set in two places: pyproject.toml and __init__.py.

I don't really get yet how this should be done... I'll try something.

Make model generation faster for nested formulations

When we have nested models (e.g. a gradient boosting regression has 100 decision trees by default), we don't necessarily need all the information about the sub-models, since we never remove them individually (i.e. we don't want to remove the individual decision trees that compose a GBT).

All the recording can take a significant amount of time. One downside of skipping it is that we won't be able to print statistics about the sub-parts of the model.

XGBoost constraints don't work with verbose mode

To reproduce, run the Janos/XGboost notebook and set verbose=True in the call to add_predictor_constr; you get the error:

            estimators.append(
>               TreeEstimator(
                    self.gp_model,
                    tree,
                    self.input,
                    tree_vars[:, i, :],
                    self.epsilon,
                    timer,
                    self.verbose,
                    **kwargs,
                )
            )
E           TypeError: __init__() got multiple values for argument 'verbose'

Support for PLSRegression from scikit-learn

I would like to implement support for sklearn.cross_decomposition.PLSRegression. It's a linear model so it should be rather straightforward. @pobonomo would you be willing to accept a PR? Any preliminary thoughts on the implementation?

Fixes to get_error

It should really return the absolute value of the difference.

Also, the documentation is wrong about the output.

Remove dependency on scikit-learn

I have introduced a small, subtle dependency on scikit-learn.

For the sklearn regressions, we use the function check_is_fitted to check that the regression was fitted.

Another issue is that the function to register predictors is also included in scikit-learn so that it can be used in pipelines.

Finalize Examples

We have to straighten out the examples.
Currently there is:

  • Some examples where we approximate functions and minimize them:
    • parabolas
    • GoldenStein
    • Peak2D
  • The Janos example
  • The price optimization example
  • Adversarial MNIST

I think that I should do a slightly more complex example with Parabolas and then use it as one basic example featuring all regression models.
Then we could keep Golden and Peak2D, but maybe drop them.
Janos is an issue, but it's also the only example we currently have that can use logistic regression, so removing it is a problem.
Price optimization we should probably change to use different regression models; it is definitely one we will keep.
MNIST I think is fine. We should just also have one for PyTorch. Ideally we could find some small pre-trained networks that we could use.

More tests on input

In the unit tests, we currently only check the validity of the models for one input type. We should check more of them; I had two bad bugs in the last week.
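One way to exercise several input types with the same checks is to normalize them up front; the helper below is a hypothetical sketch, not the package's code:

```python
import numpy as np

def to_2d(x):
    """Normalize a Python list, 1-D array, or 2-D array to a 2-D float array,
    so each test case can run the same model checks for every input type."""
    arr = np.asarray(x, dtype=float)
    return arr.reshape(1, -1) if arr.ndim == 1 else arr

# The same assertions can then loop over [list, np.array, 2-D np.array] inputs.
```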
