Comments (5)
@sfalkner @shukon
Thanks, I have solved this problem. The cause was that the loss returned from the compute function was a PyTorch CUDA tensor; I used tensor.cpu() and float() to move it to the CPU as a plain Python float. Luckily, it worked!
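In case it helps anyone else, the fix was roughly the following inside the worker's compute method (a sketch; the variable names and the placeholder loss value are illustrative):

```python
import torch
from hpbandster.core.worker import Worker

class MyWorker(Worker):
    def compute(self, config, budget, working_directory, *args, **kwargs):
        # ... training loop omitted; suppose validation leaves the loss
        # as a tensor on the GPU (or CPU, if no GPU is available)
        device = 'cuda' if torch.cuda.is_available() else 'cpu'
        validation_loss = torch.tensor(0.42, device=device)  # placeholder

        return {
            # move the tensor to the CPU and cast it to a plain Python float;
            # a CUDA tensor cannot be serialized, so the worker dies without
            # ever reporting a result
            'loss': float(validation_loss.cpu()),
            'info': {}
        }
```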
Make sure all values in the info dictionary are Python built-in types as well; non-serializable values there can also lead to workers dying. If your issue is resolved, please close it. Thank you!
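As a sketch of what that conversion can look like (`sanitize` and `raw_info` are just illustrative names):

```python
import torch

def sanitize(value):
    # torch tensors and numpy scalars both expose .item(), which returns
    # a plain Python int/float that serializes cleanly over the network
    return value.item() if hasattr(value, 'item') else value

raw_info = {'train_accuracy': torch.tensor(0.91), 'epochs': 3}
info = {key: sanitize(val) for key, val in raw_info.items()}
# info now holds only built-in types, e.g. {'train_accuracy': 0.91..., 'epochs': 3}
```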
It is always easier (and often only possible at all) to answer if you provide a stack trace of the error message, details of your hardware and environment, and a Minimum Working Example (so the error can be reproduced and, if necessary, fixed).
@shukon Thank you for the reply!
I am using this tool to search over the number of filters in one layer of a CNN, with PyTorch as the backend.
I started from the MNIST example and its PyTorch worker file and modified it for my project.
Here is my configuration space:
```python
@staticmethod
def get_configspace():
    """
    Builds the configuration space with the needed hyperparameters.
    It is easily possible to implement different types of hyperparameters.
    Besides float hyperparameters on a log scale, it can also handle
    categorical input parameters.
    :return: ConfigurationSpace object
    """
    cs = CS.ConfigurationSpace()

    lr = CSH.UniformFloatHyperparameter('lr', lower=1e-6, upper=1e-1, default_value=1e-2, log=True)
    sgd_momentum = CSH.UniformFloatHyperparameter('sgd_momentum', lower=0.0, upper=0.99, default_value=0.9, log=False)
    cs.add_hyperparameters([lr, sgd_momentum])

    # filter_num1 = CSH.UniformIntegerHyperparameter('filter_num1', lower=16, upper=48, default_value=32, log=True)
    # filter_num2 = CSH.UniformIntegerHyperparameter('filter_num2', lower=32, upper=96, default_value=64, log=True)
    # filter_num3 = CSH.UniformIntegerHyperparameter('filter_num3', lower=32, upper=96, default_value=64, log=True)
    # filter_num4 = CSH.UniformIntegerHyperparameter('filter_num4', lower=64, upper=192, default_value=128, log=True)
    # filter_num5 = CSH.UniformIntegerHyperparameter('filter_num5', lower=64, upper=192, default_value=128, log=True)
    # filter_num6 = CSH.UniformIntegerHyperparameter('filter_num6', lower=128, upper=384, default_value=256, log=True)
    # filter_num7 = CSH.UniformIntegerHyperparameter('filter_num7', lower=128, upper=384, default_value=256, log=True)
    # filter_num8 = CSH.UniformIntegerHyperparameter('filter_num8', lower=128, upper=384, default_value=256, log=True)
    filter_num9 = CSH.UniformIntegerHyperparameter('filter_num9', lower=128, upper=384, default_value=256, log=True)
    # filter_num10 = CSH.UniformIntegerHyperparameter('filter_num10', lower=128, upper=384, default_value=256, log=True)
    # filter_num11 = CSH.UniformIntegerHyperparameter('filter_num11', lower=128, upper=384, default_value=256, log=True)
    # filter_num12 = CSH.UniformIntegerHyperparameter('filter_num12', lower=256, upper=768, default_value=512, log=True)
    # filter_num13 = CSH.UniformIntegerHyperparameter('filter_num13', lower=256, upper=768, default_value=512, log=True)
    cs.add_hyperparameters([filter_num9])
    return cs
```
For now I am only searching over one layer.
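A quick way to inspect the space is to sample from it (a sketch; `MyWorker` is my worker class holding the static method above):

```python
# sanity check: draw one random configuration from the space
cs = MyWorker.get_configspace()
config = cs.sample_configuration().get_dictionary()
print(config)  # e.g. {'filter_num9': 252, 'lr': 0.029, 'sgd_momentum': 0.67}
```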
Here is the relevant log output:
```
DEBUG:hpbandster:DISPATCHER: Finished worker discovery
DEBUG:hpbandster.run_0.worker.ubuntu1604.16378:WORKER: shutting down now!
DEBUG:hpbandster:DISPATCHER: Starting worker discovery
DEBUG:hpbandster:DISPATCHER: Found 0 potential workers, 1 currently in the pool.
INFO:hpbandster:DISPATCHER: removing dead worker, hpbandster.run_0.worker.ubuntu1604.16378140503776417536
INFO:hpbandster:Job (0, 0, 0) was not completed
DEBUG:hpbandster:HBMASTER: number of workers changed to 0
DEBUG:hpbandster:adjust_queue_size: lock accquired
INFO:hpbandster:HBMASTER: adjusted queue size to (-1, 0)
DEBUG:hpbandster:DISPATCHER: job (0, 0, 0) finished
DEBUG:hpbandster:DISPATCHER: Trying to submit another job.
DEBUG:hpbandster:HBMASTER: running jobs: 1, queue sizes: (-1, 0) -> wait
DEBUG:hpbandster:DISPATCHER: jobs to submit = 0, number of idle workers = 0 -> waiting!
DEBUG:hpbandster:DISPATCHER: register_result: lock acquired
DEBUG:hpbandster:DISPATCHER: job (0, 0, 0) on hpbandster.run_0.worker.ubuntu1604.16378140503776417536 finished
DEBUG:hpbandster:job_id: (0, 0, 0)
kwargs: {'config': {'filter_num9': 252, 'lr': 0.029349927740529143, 'sgd_momentum': 0.6664149273388896}, 'budget': 10.0, 'working_directory': '.'}
result: None
exception: Worker died unexpectedly.
DEBUG:hpbandster:job_callback for (0, 0, 0) started
DEBUG:hpbandster:job_callback for (0, 0, 0) got condition
WARNING:hpbandster:job (0, 0, 0) failed with exception
Worker died unexpectedly.
DEBUG:hpbandster:Only 1 run(s) for budget 10.000000 available, need more than 5 -> can't build model!
DEBUG:hpbandster:job_callback for (0, 0, 0) finished
DEBUG:hpbandster:DISPATCHER: Finished worker discovery
DEBUG:hpbandster:DISPATCHER: Starting worker discovery
DEBUG:hpbandster:DISPATCHER: Found 0 potential workers, 0 currently in the pool.
DEBUG:hpbandster:DISPATCHER: Finished worker discovery
```
Also, I would like to know: is there a maximum number of hyperparameters this tool can search over?
Thanks!
The message in the second line,
`WORKER: shutting down now!`
suggests that your worker script terminated without waiting for any jobs. Could you please post the part of the script responsible for starting the worker here?
I think I know what the problem is, but want to confirm with your code first.
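For reference, the usual single-machine setup from the examples keeps the worker alive for the whole run; a minimal sketch (the run_id, the `my_worker` import, and the budgets are placeholders):

```python
import hpbandster.core.nameserver as hpns
from hpbandster.optimizers import BOHB

from my_worker import MyWorker  # your worker class (illustrative import)

# start a nameserver on localhost; port=None picks a random free port
NS = hpns.NameServer(run_id='example_run', host='127.0.0.1', port=None)
ns_host, ns_port = NS.start()

# the worker must stay alive to receive jobs: background=True runs it in a
# daemon thread so this script can continue into the optimizer loop below;
# if the worker script simply returns instead, jobs fail with
# "Worker died unexpectedly"
w = MyWorker(run_id='example_run', nameserver=ns_host, nameserver_port=ns_port)
w.run(background=True)

bohb = BOHB(configspace=MyWorker.get_configspace(), run_id='example_run',
            nameserver=ns_host, nameserver_port=ns_port,
            min_budget=1, max_budget=10)
res = bohb.run(n_iterations=4)

bohb.shutdown(shutdown_workers=True)
NS.shutdown()
```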
Thanks!