

mfeurer commented on May 16, 2024

Yes, kind of. The idea is to build an ensemble of models which will be used to predict for further unseen instances. Is there a reason why you want to know the single best model?

from auto-sklearn.

 avatar commented on May 16, 2024

I think autosklearn needs to support get_params so that we can get its cross-validation accuracy; only then is it truly a drop-in replacement. As it stands, to evaluate how well it is doing I have to implement cross-validation manually, because without get_params it is incompatible with the rest of scikit-learn's cross-validation routines:

NotImplementedError: auto-sklearn does not implement get_params() because it is not intended to be optimized.

If auto-sklearn supported get_params and could be pickled (since it takes a long time to train) then we wouldn't need to know the best performing model.
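
To illustrate why get_params matters here: scikit-learn's cross-validation helpers clone the estimator for each fold, and cloning rebuilds the object from whatever get_params returns. The following is a simplified, standard-library-only sketch of that mechanism, not scikit-learn's actual code; `simplified_clone`, `WithParams`, and `WithoutParams` are hypothetical names made up for this example.

```python
import copy


def simplified_clone(estimator):
    """Rough imitation of what sklearn.base.clone does for one estimator."""
    # Ask the estimator for its constructor arguments ...
    params = estimator.get_params(deep=False)
    # ... and build a fresh, unfitted object of the same class from them.
    return type(estimator)(**{k: copy.deepcopy(v) for k, v in params.items()})


class WithParams:
    """Hypothetical estimator that exposes its constructor arguments."""

    def __init__(self, time_limit=10, seed=1):
        self.time_limit = time_limit
        self.seed = seed

    def get_params(self, deep=False):
        return {"time_limit": self.time_limit, "seed": self.seed}


class WithoutParams(WithParams):
    """Hypothetical estimator that refuses, like the error quoted above."""

    def get_params(self, deep=False):
        raise NotImplementedError("get_params() is not implemented")


original = WithParams(time_limit=30, seed=7)
fresh = simplified_clone(original)
print(fresh.time_limit, fresh.seed)  # 30 7

try:
    simplified_clone(WithoutParams())
    clone_failed = False
except NotImplementedError as exc:
    clone_failed = True
    print("cloning failed:", exc)
```

An estimator that raises in get_params fails at the cloning step, before any fold is fitted.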

All that said, it would be nice to know which models comprised the ensemble, for sure.


 avatar commented on May 16, 2024

I created a fork where I tried to get this working. The steps were:

  • Switch to multiprocess https://github.com/uqfoundation/multiprocess
    • Change all sklearn and other code to use multiprocess instead of multiprocessing
  • Switch to dill https://github.com/uqfoundation/dill
    • Change all sklearn and other code to 'import dill as pickle'
  • Monkey patch process.py so that it tries to serialize AuthenticationStrings
  • Modify AutoSklearnClassifier to be pickleable and confirm I could pickle it:
class AutoSklearnClassifier2(AutoSklearnClassifier):
    def get_params(self, deep=False):
        # Hard-coded constructor arguments, returned so that scikit-learn
        # can clone the estimator.
        init = {
            'time_left_for_this_task': 10,
            'per_run_time_limit': 360,
            'initial_configurations_via_metalearning': 25,
            'ensemble_size': 50,
            'ensemble_nbest': 50,
            'seed': 1,
            'ml_memory_limit': 3000,
        }
        return init

    def set_params(self, **parameters):
        # setattr is a builtin function, not a method of the estimator.
        for parameter, value in parameters.items():
            setattr(self, parameter, value)
        return self

In the end, the code works (I can train and pickle models), but cross validation still does not work. I can pass the classifier to a scikit-learn cross validator, but it freezes after printing "Aborting Training!".

Your thoughts?
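
One possible alternative to a hard-coded dictionary is to derive the parameters from the constructor signature, so that get_params reports the values the object was actually created with. This is a minimal sketch assuming every constructor argument is stored as an attribute of the same name; `ParamsFromInit` and `ToyClassifier` are hypothetical names, and `ToyClassifier` only stands in for AutoSklearnClassifier with two of its arguments.

```python
import inspect


class ParamsFromInit:
    """Mixin that derives get_params/set_params from __init__'s signature."""

    def get_params(self, deep=False):
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        # Assumption: each constructor argument is stored under its own name.
        return {name: getattr(self, name) for name in names}

    def set_params(self, **parameters):
        valid = self.get_params()
        for name, value in parameters.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self


class ToyClassifier(ParamsFromInit):
    """Hypothetical stand-in for AutoSklearnClassifier."""

    def __init__(self, time_left_for_this_task=10, seed=1):
        self.time_left_for_this_task = time_left_for_this_task
        self.seed = seed


clf = ToyClassifier(time_left_for_this_task=60)
print(clf.get_params())
clf.set_params(seed=42)
print(clf.get_params())
```

With this approach a clone built from get_params keeps the caller's settings instead of silently resetting them to fixed values.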


mfeurer commented on May 16, 2024

Hm, I can't find the changes you propose. Can you make them visible to me?

Besides that, I think there are three things you want to accomplish:

  1. Serialize the configurations found by auto-sklearn
  2. Find out which models are in the final ensemble
  3. Use auto-sklearn in cross_val_score and cross_val_predict

As I read it, the main problem is that AutoSklearn subclasses multiprocessing.Process. I think the following changes would enable you to accomplish 1. and 2. without any additional dependencies:

  1. Move loading the pickled models from the predict method to the fit method and save them in a member variable. This would make 2. possible (with a better repr for the estimator and for all wrappers in ParamSklearn).
  2. Write a load and a save method instead of using pickle.
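
The two changes above could look roughly like the following. This is only a sketch under stated assumptions, not auto-sklearn's code: `ToyAutoML` is a hypothetical class, the model entries are invented, a `threading.Lock` stands in for the process-related state that cannot be pickled, and JSON is chosen here purely for illustration (real fitted models would need their own serialization).

```python
import json
import os
import tempfile
import threading


class ToyAutoML:
    """Hypothetical stand-in for an estimator holding unpicklable state."""

    def __init__(self, seed=1):
        self.seed = seed
        self._lock = threading.Lock()  # stands in for process-related state
        self.models_ = []  # filled by fit() and kept as a member variable

    def fit(self):
        # A real fit would load the trained models here; these are made up.
        self.models_ = [
            {"weight": 0.6, "classifier": "random_forest"},
            {"weight": 0.4, "classifier": "liblinear_svc"},
        ]
        return self

    def save(self, path):
        # Write only the serializable state, never the whole object.
        state = {"seed": self.seed, "models": self.models_}
        with open(path, "w") as handle:
            json.dump(state, handle)

    @classmethod
    def load(cls, path):
        with open(path) as handle:
            state = json.load(handle)
        restored = cls(seed=state["seed"])
        restored.models_ = state["models"]
        return restored


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "automl.json")
    ToyAutoML(seed=3).fit().save(path)
    restored = ToyAutoML.load(path)

print(restored.seed, [m["classifier"] for m in restored.models_])
```

Because the models live in a member variable after fit, the same attribute answers the question of which models are in the ensemble.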

I don't see right now why cross validation does not work, though. It looks as if a signal from the runsolver, a program that wraps the execution of the machine learning algorithm to control memory and runtime, stopped more than it should have. Does this happen repeatedly, and also in the development branch?

One more remark: Please do not propose changes which add non-standard dependencies with C extensions. We're using auto-sklearn in an environment where we can't deploy them.


mfeurer commented on May 16, 2024

I just pushed version 0.1 to PyPI. The AutoSklearnClassifier can now be pickled, and the user can directly access the models as well if necessary.


 avatar commented on May 16, 2024

Nice.

