Comments (6)
The latest version of auto-sklearn features pickleable classifiers/regressors. If there is still an issue with model persistence, please open a new issue.
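Persisting a pickleable estimator is the standard pickle round trip. A minimal sketch, using a plain scikit-learn classifier as a stand-in (the same pattern would apply to a fitted auto-sklearn estimator; the toy data is made up):

```python
import pickle

from sklearn.ensemble import RandomForestClassifier

# Tiny toy dataset; a fitted auto-sklearn estimator would be pickled the same way.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 1, 0]
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Serialize to bytes; pickle.dump(clf, open(path, "wb")) writes to disk instead.
blob = pickle.dumps(clf)
restored = pickle.loads(blob)

# The restored model makes the same predictions as the original.
assert list(restored.predict(X)) == list(clf.predict(X))
```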
from auto-sklearn.
Have a look at this.
from auto-sklearn.
Yes, that would be useful, but it isn't possible so far. What you can do is use show_models(). It outputs something like:
[(weight, constructor),
(weight, constructor)]
which defines the final ensemble. You can use that to retrain your model on the full data and pickle it in your own code.
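The (weight, constructor) pairs describe a weighted average of the individual models' predictions. A minimal sketch of how such an ensemble combines class probabilities (the weights and probability values below are invented for illustration):

```python
def weighted_probs(ensemble):
    """Combine per-model class probabilities using the ensemble weights."""
    n_classes = len(ensemble[0][1])
    total = [0.0] * n_classes
    for weight, probs in ensemble:
        for i, p in enumerate(probs):
            total[i] += weight * p
    return total

# Two models with hypothetical weights and probability estimates for two classes.
ensemble = [(0.6, [0.9, 0.1]), (0.4, [0.2, 0.8])]
combined = weighted_probs(ensemble)  # approximately [0.62, 0.38]
```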
from auto-sklearn.
It looks like scikit-learn uses some external libraries for this:
http://stackoverflow.com/questions/10592605/save-classifier-to-disk-in-scikit-learn
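The linked answer recommends joblib (shipped alongside scikit-learn) for persisting estimators, since it handles large numpy arrays more efficiently than plain pickle. A minimal sketch of the dump/load round trip (the payload and file path here are arbitrary stand-ins; a fitted estimator is the typical use):

```python
import os
import tempfile

import joblib

# Any picklable object works; a fitted estimator is the usual payload.
model = {"weights": [0.25, 0.75], "name": "demo"}

path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)      # write to disk
restored = joblib.load(path)  # read back

assert restored == model
```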
from auto-sklearn.
Is there a simple programmatic way to convert the output of show_models() into a string that can be used to construct the classifiers in code? Currently it comes out as:
(0.040000, SimpleClassificationPipeline(configuration={
'balancing:strategy': 'weighting',
'classifier:__choice__': 'random_forest',
'classifier:random_forest:bootstrap': 'False',
'classifier:random_forest:criterion': 'entropy',
'classifier:random_forest:max_depth': 'None',
'classifier:random_forest:max_features': 1.6519823800472522,
'classifier:random_forest:max_leaf_nodes': 'None',
'classifier:random_forest:min_samples_leaf': 14,
'classifier:random_forest:min_samples_split': 13,
'classifier:random_forest:min_weight_fraction_leaf': 0.0,
'classifier:random_forest:n_estimators': 100,
'imputation:strategy': 'mean',
'one_hot_encoding:use_minimum_fraction': 'False',
'preprocessor:__choice__': 'no_preprocessing',
'rescaling:__choice__': 'min/max'})),
(0.040000, SimpleClassificationPipeline(configuration={
'balancing:strategy': 'weighting',
'classifier:__choice__': 'sgd',
'classifier:sgd:alpha': 8.157889958167601e-05,
'classifier:sgd:average': 'False',
'classifier:sgd:eta0': 0.042599381735495594,
'classifier:sgd:fit_intercept': 'True',
'classifier:sgd:learning_rate': 'optimal',
'classifier:sgd:loss': 'perceptron',
'classifier:sgd:n_iter': 25,
'classifier:sgd:penalty': 'l2',
'imputation:strategy': 'median',
'one_hot_encoding:minimum_fraction': 0.040130045634589266,
'one_hot_encoding:use_minimum_fraction': 'True',
'preprocessor:__choice__': 'no_preprocessing',
'rescaling:__choice__': 'normalize'})),
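There is no official helper for this as far as I know, but since each printed entry embeds a flat configuration={...} dict of literals, one option is to parse it back out with ast.literal_eval. A sketch under that assumption (extract_configurations and the abbreviated sample text are my own, not part of the auto-sklearn API):

```python
import ast
import re

# Abbreviated sample of what show_models() prints.
SAMPLE = """(0.040000, SimpleClassificationPipeline(configuration={
  'balancing:strategy': 'weighting',
  'classifier:__choice__': 'random_forest',
  'classifier:random_forest:n_estimators': 100}))"""

def extract_configurations(text):
    """Return each configuration dict embedded in the printed repr."""
    return [
        ast.literal_eval(m.group(1))
        for m in re.finditer(r"configuration=(\{.*?\})\)", text, re.DOTALL)
    ]

configs = extract_configurations(SAMPLE)
assert configs[0]["classifier:__choice__"] == "random_forest"
```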
from auto-sklearn.
Also, show_models() can be very slow and uses a lot of memory; in my case it takes tens of minutes and tens of GB. Instead I am using
ats=salted_temp_dir_of_autosklearn
for quality in $(grep obj $ats/log-run*|sed -e 's/^.*obj\ \(.*$\)/\1/'|sort|uniq|head -10); do grep final -A 1 $(grep -l "$quality" $ats/log-run*|sort|head -1); done;
to get the top 10 classifiers (the ones with the best scores) from the log files, which takes virtually no time at all.
I wonder if there is a reason show_models() does what it does.
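The same log-scraping idea can be sketched in Python. Note the "obj <value>" line format and the log-run* file naming mirror the shell pipeline above and are assumptions about auto-sklearn's temp-dir layout, not a documented interface:

```python
import os
import re
import tempfile

OBJ_RE = re.compile(r"obj\s+(-?\d+(?:\.\d+)?)")

def best_runs(log_dir, n=10):
    """Return the n lowest objective values found in log-run* files."""
    scores = []
    for name in os.listdir(log_dir):
        if not name.startswith("log-run"):
            continue
        with open(os.path.join(log_dir, name)) as fh:
            for line in fh:
                m = OBJ_RE.search(line)
                if m:
                    scores.append((float(m.group(1)), name))
    return sorted(scores)[:n]

# Demo with synthetic log files in the assumed format.
tmp = tempfile.mkdtemp()
for name, text in [("log-run-1", "final\nobj 0.25\n"),
                   ("log-run-2", "final\nobj 0.10\n")]:
    with open(os.path.join(tmp, name), "w") as fh:
        fh.write(text)

assert best_runs(tmp) == [(0.10, "log-run-2"), (0.25, "log-run-1")]
```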
from auto-sklearn.