zygmuntz / hyperband
Tuning hyperparams fast with Hyperband
Home Page: http://fastml.com/tuning-hyperparams-fast-with-hyperband/
License: Other
Thanks for putting together this library!
What is the best way to view and interpret the pickled results from one of the runs?
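As a hedged sketch, assuming the run saved a list of result dicts to a results.pkl file with 'loss' and 'params' keys (the file name and keys are assumptions about what the run produced, not confirmed from the repo):

```python
import pickle

# Load the pickled results; 'results.pkl' is an assumed file name.
with open('results.pkl', 'rb') as f:
    results = pickle.load(f)

# Assuming each entry is a dict with 'loss' and 'params' keys,
# sort ascending by loss and print the best few configurations.
for r in sorted(results, key=lambda r: r['loss'])[:5]:
    print(r['loss'], r['params'])
```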
Hi @zygmuntz,
I am confused about what happens when the same configuration is run twice with different iteration budgets. For example, say I have two configurations, A and B. In the first run I give A and B 5 iterations each, and A performs better than B. In the second run I give A 10 iterations, i.e. I assign more resources to A. Does Hyperband resume A from where the first run ended, or does it run A from scratch? In other words, does A run for (10 - 5 = 5) iterations or for the full 10 iterations in the second run?
I notice that in each try_params function, a completely new classifier with the specified number of iterations n_iterations seems to be created. See, for example, https://github.com/zygmuntz/hyperband/blob/master/defs_regression/sgd.py#L57 and https://github.com/zygmuntz/hyperband/blob/master/defs_regression/gb.py#L43.
Thanks for sharing!
Ramay7
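For illustration, a minimal sketch of the two behaviours the question contrasts, using scikit-learn's SGDRegressor; the estimator and settings here are stand-ins for demonstration, not the repo's actual defs:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

x, y = make_regression(n_samples=200, n_features=10, random_state=0)

# From scratch: a fresh estimator runs all 10 iterations, which is
# what constructing a new classifier inside try_params implies.
fresh = SGDRegressor(max_iter=10, tol=None, random_state=0)
fresh.fit(x, y)

# Resuming: with warm_start=True, refitting continues from the
# previously learned coefficients instead of reinitialising them.
resumed = SGDRegressor(max_iter=5, tol=None, warm_start=True, random_state=0)
resumed.fit(x, y)   # first 5 iterations
resumed.fit(x, y)   # 5 more, continuing from the learned weights
```

Since the linked try_params functions build a fresh estimator on every call, the from-scratch behaviour is what the question observes.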
When I try to run an example I get
IOError: [Errno 2] No such file or directory: 'data/classification.pkl'
Can you post that file somewhere? Or is it derived from one of the other data files?
Thanks
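Until the file is posted, a stand-in can be generated. A minimal sketch, where the dict keys are an assumption about the format the repo's loader expects:

```python
import os
import pickle
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Build a small synthetic binary classification set as a stand-in.
x, y = make_classification(n_samples=1000, n_features=20, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0)

# ASSUMPTION: the loader expects a dict with these keys; adjust to
# whatever format the repo's load function actually reads.
data = {'x_train': x_train, 'y_train': y_train,
        'x_test': x_test, 'y_test': y_test}

if not os.path.exists('data'):
    os.makedirs('data')
with open('data/classification.pkl', 'wb') as f:
    pickle.dump(data, f)
```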
I am not a lawyer (and this is not legal advice), but the current license appears to be either: (1) incompatible with the GPL, or (2) effectively the same as the regular BSD 2-clause since any user could sublicense to whatever government agency he or she so desires. Either way, it's vague and really should be replaced by the regular BSD 2-clause license (or the MIT license or whatever).
Python 2 is nearly dead.
Hi! I am wondering whether it is possible to optimize with cross-validation, preferably with a custom scoring function. Currently it picks the configuration that minimizes, e.g., the log loss on the training data, if I am not mistaken. It would be good to also have options similar to what grid search offers in scikit-learn.
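For what it's worth, a minimal sketch of how a try_params-style function could score a configuration with cross-validation and a chosen metric; the function signature and the returned 'loss' key are assumptions about the repo's interface:

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def try_params_cv(n_iterations, params, x, y):
    # ASSUMPTION: returning a dict with a 'loss' key is what the
    # Hyperband loop consumes; here loss = 1 - mean CV AUC, so a
    # higher AUC means a lower loss to minimise.
    clf = GradientBoostingClassifier(n_estimators=int(n_iterations), **params)
    auc = cross_val_score(clf, x, y, cv=5, scoring='roc_auc').mean()
    return {'loss': 1.0 - auc}
```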
Looking over the results from one of my runs, I see both a number of iterations and a number of runs. Can you explain the difference between the two?
How would I go about setting up an experiment with 50 pulls of the bandits? I assume that means setting the run parameter?
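A hedged sketch of the entry point, mirroring the README-style usage; whether and where a pull budget can be set is an assumption noted in the comments:

```python
from hyperband import Hyperband
from defs.gb import get_params, try_params  # assumed defs module

# ASSUMPTION: the number of configurations sampled ("pulls") is
# derived inside Hyperband from its max_iter and eta constants, so
# an exact budget of 50 pulls would mean editing those constants in
# hyperband.py rather than passing a run parameter.
hb = Hyperband(get_params, try_params)
results = hb.run(skip_last=1)
print('{} total runs'.format(len(results)))
```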
Hi,
Thank you for this implementation of Hyperband.
I noticed that in defs_regression, the prediction "p" for keras_mlp and rf has shape (n, 1) whereas "target" has shape (n,).
I wanted to define my own metric that involves subtracting the target from the prediction at some point. For small arrays, subtracting an (n,) array from an (n, 1) array is tolerable, but for n > 100,000 I got a memory error.
You might want to squeeze the prediction p to fix this problem.
Thank you.
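A minimal demonstration of why the shapes matter: broadcasting an (n, 1) array against an (n,) array yields an (n, n) matrix, which is what exhausts memory for large n:

```python
import numpy as np

n = 5
p = np.zeros((n, 1))   # prediction shaped (n, 1), as from keras_mlp / rf
target = np.zeros(n)   # target shaped (n,)

# Broadcasting turns (n, 1) - (n,) into an (n, n) matrix -- for
# n > 100,000 that is ~10^10 elements, hence the memory error.
print((p - target).shape)             # (5, 5)

# Squeezing the prediction restores the intended element-wise result.
print((p.squeeze() - target).shape)   # (5,)
```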
```python
loss = result['loss']
val_losses.append( loss )
```
Can we replace the loss with AUC if I am more interested in AUC? Or will log loss be better even if I care about AUC?
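One hedged option is to have try_params report AUC through the 'loss' key, so the halving step ranks on it directly; 'evaluate' below is a hypothetical helper, not the repo's function:

```python
from sklearn.metrics import log_loss, roc_auc_score

def evaluate(model, x_val, y_val):
    # Hyperband minimises 'loss', so reporting 1 - AUC there makes
    # the selection optimise AUC; keeping log loss alongside lets
    # you compare the two metrics after the run.
    p = model.predict_proba(x_val)[:, 1]
    auc = roc_auc_score(y_val, p)
    return {'loss': 1.0 - auc, 'log_loss': log_loss(y_val, p), 'auc': auc}
```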
Hello,
Thanks for this very nice repo! But something isn't very clear to me. As the blog post says, Hyperband runs configs for just an iteration or two at first, to get a taste of how they perform. Then it takes the best performers and runs them longer.
So I thought that in the outer loop we would first randomly instantiate the configuration set T and then update it at the end of each inner loop.
However, this is not the case: for each new s, a random T is drawn again, without taking the previously computed T into account. Am I missing something here?
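That is indeed how the published algorithm works: each bracket s draws a fresh random sample, and only within a bracket do survivors carry over. A schematic sketch of the loop structure (get_random_config and run_then_return_val_loss are stand-ins for the repo's get_params and try_params):

```python
import math, random

def get_random_config():
    # Stand-in for get_params(): sample one hyperparameter.
    return {'lr': 10 ** random.uniform(-4, -1)}

def run_then_return_val_loss(config, n_iterations):
    # Stand-in for try_params(): train and return a validation loss.
    return random.random()

max_iter, eta = 81, 3
s_max = int(math.log(max_iter) / math.log(eta))
B = (s_max + 1) * max_iter
best = (float('inf'), None)

for s in reversed(range(s_max + 1)):
    n = int(math.ceil(float(B) / max_iter / (s + 1) * eta ** s))
    r = max_iter * eta ** (-s)

    # A fresh random sample for THIS bracket only -- survivors of
    # previous brackets are not carried over, matching the paper.
    T = [get_random_config() for _ in range(n)]

    for i in range(s + 1):
        n_i, r_i = n * eta ** (-i), r * eta ** i
        losses = [run_then_return_val_loss(t, r_i) for t in T]
        for loss, t in zip(losses, T):
            if loss < best[0]:
                best = (loss, t)
        # Successive halving: keep only the best configs in the bracket.
        keep = int(n_i / eta)
        T = [t for _, t in sorted(zip(losses, T), key=lambda p: p[0])[:keep]]

print(best)
```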
I've been working with your code lately and I've noticed that in keras_mlp.py the last hidden layer, in both models, never gets dropout applied:
```python
model = Sequential()
model.add( Dense( params['layer_1_size'], init = params['init'],
    activation = params['layer_1_activation'], input_dim = input_dim ))

for i in range( int( params['n_layers'] ) - 1 ):
    extras = 'layer_{}_extras'.format( i + 1 )
    if params[extras]['name'] == 'dropout':
        model.add( Dropout( params[extras]['rate'] ))
    elif params[extras]['name'] == 'batchnorm':
        model.add( BatchNorm())
    model.add( Dense( params['layer_{}_size'.format( i + 2 )], init = params['init'],
        activation = params['layer_{}_activation'.format( i + 2 )]))

model.add( Dense( 1, init = params['init'], activation = 'linear' ))
```
As can be seen in the code, the last hidden layer can't have dropout, since the dropout is added before the layer itself. Is this intentional, or is it undesired behaviour?
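If dropout on the last hidden layer is desired, one hedged option is a helper that appends a layer's extras after its Dense layer, called once per hidden layer including the last; whether the search space actually samples extras for that last layer is an assumption:

```python
from keras.layers import Dropout
from keras.layers.normalization import BatchNormalization as BatchNorm

def add_extras(model, params, layer_index):
    # Hypothetical helper: append the dropout/batchnorm extra for the
    # given layer, if any. ASSUMPTION: the search space defines
    # 'layer_{i}_extras' for every hidden layer, including the last.
    extras = params.get('layer_{}_extras'.format(layer_index), {'name': 'none'})
    if extras['name'] == 'dropout':
        model.add(Dropout(extras['rate']))
    elif extras['name'] == 'batchnorm':
        model.add(BatchNorm())
```

Calling add_extras after each Dense, including once for layer n_layers just before the output layer, would give every hidden layer a chance at dropout.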
Not having a small data set available makes it cumbersome to test this.