Comments (6)

mfeurer commented on May 22, 2024

The latest version of auto-sklearn features pickleable classifiers/regressors. If there is still an issue with model persistence, please open a new issue.
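A minimal sketch of what that persistence could look like, using only the standard `pickle` module. The helper names `save_model` and `load_model` are made up for illustration and are not part of auto-sklearn; the assumption is simply that the fitted estimator can be pickled like any other Python object.

```python
import pickle
from pathlib import Path


def save_model(model, path):
    """Serialize a fitted estimator to disk with pickle (hypothetical helper)."""
    with Path(path).open("wb") as fh:
        pickle.dump(model, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load a previously pickled estimator. Only unpickle files you trust."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)
```

With auto-sklearn this would presumably be called as `save_model(automl, "automl.pkl")` after fitting, and `load_model("automl.pkl")` in the process that serves predictions. The loading side needs compatible versions of auto-sklearn and scikit-learn installed.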

from auto-sklearn.

mfeurer commented on May 22, 2024

Have a look at this.

mfeurer commented on May 22, 2024

Yes, that would be useful, but so far it can't. What you can do is use show_models(). It outputs something like:

[(weight, constructor),
 (weight, constructor)]

which determines the final ensemble. You can use that in order to retrain your model on the full data and pickle it in your own code.
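A rough sketch of that retraining step, assuming the ensemble is a weighted average of the members' class probabilities and that each member exposes `fit(X, y)` and `predict_proba(X)`. The `WeightedEnsemble` class is hypothetical and not part of auto-sklearn; building the member models from the reported constructors is left to your own code.

```python
class WeightedEnsemble:
    """Weighted soft-voting ensemble over already-constructed models.

    `members` is a list of (weight, model) pairs, mirroring the
    [(weight, constructor), ...] shape reported by show_models().
    Each model is assumed to expose fit(X, y) and predict_proba(X).
    """

    def __init__(self, members):
        self.members = list(members)
        if not self.members:
            raise ValueError("members must not be empty")
        if sum(weight for weight, _ in self.members) <= 0:
            raise ValueError("weights must sum to a positive value")

    def fit(self, X, y):
        # Retrain every member on the full data set.
        for _, model in self.members:
            model.fit(X, y)
        return self

    def predict_proba(self, X):
        # Weighted average of the members' class probabilities,
        # with the weights normalized so that they sum to 1.
        total = sum(weight for weight, _ in self.members)
        combined = None
        for weight, model in self.members:
            probs = model.predict_proba(X)
            scaled = [[weight / total * p for p in row] for row in probs]
            if combined is None:
                combined = scaled
            else:
                combined = [
                    [a + b for a, b in zip(row_a, row_b)]
                    for row_a, row_b in zip(combined, scaled)
                ]
        return combined

    def predict(self, X):
        # Column index of the most probable class for each row
        # (an index, not necessarily the original class label).
        return [row.index(max(row)) for row in self.predict_proba(X)]
```

Once fitted on the full data, an object like this can be pickled from your own module, provided every member model is itself pickleable.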

giorgio79 commented on May 22, 2024

It looks like scikit-learn relies on external libraries for this:
http://stackoverflow.com/questions/10592605/save-classifier-to-disk-in-scikit-learn

Motorrat commented on May 22, 2024

Is there a simple programmatic way to convert the output of show_models() into a string that can be used to construct the classifiers in code? Currently it comes out as:

(0.040000, SimpleClassificationPipeline(configuration={
  'balancing:strategy': 'weighting',
  'classifier:__choice__': 'random_forest',
  'classifier:random_forest:bootstrap': 'False',
  'classifier:random_forest:criterion': 'entropy',
  'classifier:random_forest:max_depth': 'None',
  'classifier:random_forest:max_features': 1.6519823800472522,
  'classifier:random_forest:max_leaf_nodes': 'None',
  'classifier:random_forest:min_samples_leaf': 14,
  'classifier:random_forest:min_samples_split': 13,
  'classifier:random_forest:min_weight_fraction_leaf': 0.0,
  'classifier:random_forest:n_estimators': 100,
  'imputation:strategy': 'mean',
  'one_hot_encoding:use_minimum_fraction': 'False',
  'preprocessor:__choice__': 'no_preprocessing',
  'rescaling:__choice__': 'min/max'})),
(0.040000, SimpleClassificationPipeline(configuration={
  'balancing:strategy': 'weighting',
  'classifier:__choice__': 'sgd',
  'classifier:sgd:alpha': 8.157889958167601e-05,
  'classifier:sgd:average': 'False',
  'classifier:sgd:eta0': 0.042599381735495594,
  'classifier:sgd:fit_intercept': 'True',
  'classifier:sgd:learning_rate': 'optimal',
  'classifier:sgd:loss': 'perceptron',
  'classifier:sgd:n_iter': 25,
  'classifier:sgd:penalty': 'l2',
  'imputation:strategy': 'median',
  'one_hot_encoding:minimum_fraction': 0.040130045634589266,
  'one_hot_encoding:use_minimum_fraction': 'True',
  'preprocessor:__choice__': 'no_preprocessing',
  'rescaling:__choice__': 'normalize'})),
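One possible approach, sketched under the assumption that the printed text always has the `(weight, ClassName(configuration={...}))` shape shown above with a flat configuration dict: extract each entry with a regular expression and read the dict with `ast.literal_eval`. The function name `parse_show_models` is made up for illustration, and converting the quoted `'None'`, `'True'` and `'False'` strings back to Python values is a convenience that may or may not match what a given auto-sklearn version expects.

```python
import ast
import re

# Matches one "(weight, ClassName(configuration={...}))" entry. The
# configuration dict is assumed to be flat, so a non-greedy match up to
# the first closing brace is enough.
_ENTRY = re.compile(
    r"\(\s*(?P<weight>[0-9.eE+-]+)\s*,\s*(?P<cls>\w+)"
    r"\(configuration=(?P<config>\{.*?\})\s*\)\s*\)",
    re.DOTALL,
)

_LITERALS = {"None": None, "True": True, "False": False}


def parse_show_models(text):
    """Return a list of (weight, class_name, config_dict) tuples."""
    entries = []
    for match in _ENTRY.finditer(text):
        raw = ast.literal_eval(match.group("config"))
        config = {}
        for key, value in raw.items():
            # Booleans and None are printed as quoted strings; convert them back.
            if isinstance(value, str) and value in _LITERALS:
                value = _LITERALS[value]
            config[key] = value
        entries.append((float(match.group("weight")), match.group("cls"), config))
    return entries
```

The result is plain data, for example `parse_show_models(str(automl.show_models()))`. Turning each dict back into a pipeline object still depends on the auto-sklearn version in use, so that step is not sketched here.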

Motorrat commented on May 22, 2024

Also, show_models() can be very slow and memory-hungry: in my case it takes tens of minutes and tens of GB.

Instead I am using
ats=salted_temp_dir_of_autosklearn
for quality in $(grep obj $ats/log-run*|sed -e 's/^.*obj\ \(.*$\)/\1/'|sort|uniq|head -10); do grep final -A 1 $(grep -l "$quality" $ats/log-run*|sort|head -1); done;
to pull the 10 best-scoring classifiers straight from the log files, which takes virtually no time at all. Here ats is the temporary directory auto-sklearn writes its logs to.
I wonder if there is a reason show_models() does what it does.
