Comments (6)
Yes, kind of. The idea is to build an ensemble of models which will be used to predict for further unseen instances. Is there a reason why you want to know the single best model?
from auto-sklearn.
I think auto-sklearn needs to support get_params so that we can get its cross-validation accuracy; only then is it truly a drop-in replacement. As it stands, to evaluate how well it is doing I will need to implement cross-validation manually, because without get_params it is incompatible with the rest of scikit-learn's cross-validation routines.
NotImplementedError: auto-sklearn does not implement get_params() because it is not intended to be optimized.
If auto-sklearn supported get_params and could be pickled (since it takes a long time to train) then we wouldn't need to know the best performing model.
All that said, it would be nice to know which models comprised the ensemble, for sure.
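For reference, the manual cross-validation mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: MajorityClassifier is a hypothetical stand-in for an estimator without get_params(), and manual_cross_val_score is a made-up helper name, not part of scikit-learn or auto-sklearn. A fresh estimator is built for every fold through a factory callable, so nothing has to be cloned.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in for an estimator that lacks get_params()."""

    def fit(self, X, y):
        # Remember the most frequent training label.
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def manual_cross_val_score(make_estimator, X, y, n_folds=3):
    """Return one accuracy per fold, building a fresh estimator each time."""
    n = len(X)
    scores = []
    for fold in range(n_folds):
        # Interleaved folds: sample i belongs to fold i % n_folds.
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        est = make_estimator()  # no clone(), so no get_params() needed
        est.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = est.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return scores


X = [[i] for i in range(9)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
scores = manual_cross_val_score(MajorityClassifier, X, y)
```

Passing the class (or any zero-argument callable) instead of an instance is what avoids the need for get_params here; the trade-off is that none of scikit-learn's own splitters or scorers are reused.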
I created a fork where I tried to get this working. The steps were:
- Switch to multiprocess https://github.com/uqfoundation/multiprocess
- Change all sklearn and other code to use multiprocess instead of multiprocessing
- Switch to dill https://github.com/uqfoundation/dill
- Change all sklearn and other code to 'import dill as pickle'
- Monkey patch process.py so that it tries to serialize AuthenticationStrings
- Modify AutoSklearnClassifier to be pickleable and confirm I could pickle it:
class AutoSklearnClassifier2(AutoSklearnClassifier):
    def get_params(self, deep=False):
        # Hard-coded constructor arguments reported to scikit-learn.
        init = {
            'time_left_for_this_task': 10,
            'per_run_time_limit': 360,
            'initial_configurations_via_metalearning': 25,
            'ensemble_size': 50,
            'ensemble_nbest': 50,
            'seed': 1,
            'ml_memory_limit': 3000
        }
        return init

    def set_params(self, **parameters):
        for parameter, value in parameters.items():
            # Use the built-in setattr(); instances have no self.setattr method.
            setattr(self, parameter, value)
        return self
In the end, the code works (I can train and pickle models), but it still does not work for cross-validation. I can pass it to a scikit-learn cross-validator, but it freezes after it prints "Aborting Training!".
Your thoughts?
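As a point of comparison, here is a minimal sketch of the get_params/set_params convention that scikit-learn's clone relies on: clone asks the estimator for its parameters and builds a new instance from them. SketchEstimator and clone_sketch are hypothetical names used only for illustration (clone_sketch is a simplified imitation, not scikit-learn's implementation), and only three of the parameters from the snippet above are reproduced. The difference from a fixed dictionary is that the values are read back from the instance, so a copy keeps whatever settings the original was given.

```python
import inspect


class SketchEstimator:
    """Hypothetical estimator; parameter names borrowed from the snippet above."""

    def __init__(self, time_left_for_this_task=10, per_run_time_limit=360, seed=1):
        # Store every constructor argument unchanged under the same name.
        self.time_left_for_this_task = time_left_for_this_task
        self.per_run_time_limit = per_run_time_limit
        self.seed = seed

    def get_params(self, deep=False):
        # Read the current values of all constructor arguments.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}

    def set_params(self, **parameters):
        valid = self.get_params()
        for name, value in parameters.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self


def clone_sketch(estimator):
    """Simplified imitation of cloning: rebuild from the reported parameters."""
    return type(estimator)(**estimator.get_params())


est = SketchEstimator(seed=7).set_params(per_run_time_limit=60)
copy = clone_sketch(est)
```

With a hard-coded dictionary, a copy made this way would always be constructed with the listed defaults, regardless of how the original instance was configured.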
Hm, I can't find the changes you propose. Can you make them visible to me?
Besides that, I think there are three things you want to accomplish:
1. Serialize the configurations found by auto-sklearn
2. Find out which models are in the final ensemble
3. Use auto-sklearn in cross_val_score and cross_val_predict
and as it reads, AutoSklearn subclassing multiprocessing.Process is the main problem. I think the following changes would enable you to accomplish 1. and 2. without any additional dependencies:
- Move loading the pickled models from the predict method to the fit method and save them in a member variable. This would make 2. possible (with a better repr, and a better repr for all wrappers in ParamSklearn).
- Write a load and a save method instead of using pickle.
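A rough sketch of what these two changes could look like, using only the standard library. Everything here is an assumption made for illustration: EnsembleSketch, models_, and the JSON file layout are hypothetical and are not auto-sklearn's actual classes or storage format. The point is only that fit keeps the ensemble members on the instance, and that save/load write and read them explicitly instead of pickling the whole object.

```python
import json
import os
import tempfile


class EnsembleSketch:
    """Hypothetical wrapper; not auto-sklearn's actual class or API."""

    def __init__(self):
        self.models_ = None  # filled in by fit()

    def fit(self, configurations, weights):
        # Keep the ensemble members on the instance at fit time,
        # rather than loading them lazily inside predict().
        self.models_ = [
            {"config": config, "weight": weight}
            for config, weight in zip(configurations, weights)
        ]
        return self

    def save(self, path):
        # Explicit serialization of the stored members, no pickle involved.
        with open(path, "w") as fh:
            json.dump(self.models_, fh)

    @classmethod
    def load(cls, path):
        obj = cls()
        with open(path) as fh:
            obj.models_ = json.load(fh)
        return obj


ens = EnsembleSketch().fit(
    [{"classifier": "random_forest"}, {"classifier": "sgd"}],
    [0.7, 0.3],
)
path = os.path.join(tempfile.mkdtemp(), "ensemble.json")
ens.save(path)
restored = EnsembleSketch.load(path)
```

Because the members live in models_ after fit, listing which models are in the ensemble becomes a matter of inspecting that attribute.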
I don't see right now why CV does not work, though. It seems like a signal from the runsolver, a program which wraps the execution of the machine learning algorithm to control memory and runtime, stopped more than it should have. Does this happen repeatedly, and also in the development branch?
One more remark: Please do not propose changes which add non-standard dependencies with C extensions. We're using auto-sklearn in an environment where we can't deploy them.
I just pushed version 0.1 to PyPI. The AutoSklearnClassifier can now be pickled and the user can directly access the models as well if necessary.
Nice.