
Comments (5)

tzc1994 commented on August 11, 2024

@sfalkner @shukon
Thanks, I have solved this problem. The reason was that the loss returned from the compute function was a PyTorch CUDA tensor. I now use tensor.cpu() and float() to copy this CUDA tensor to the CPU and convert it to a Python float. Luckily, it worked!
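
For readers hitting the same error, the fix described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: `to_python_float` and `build_result` are hypothetical helper names, and the sketch relies only on the fact that a one-element PyTorch tensor (on CPU or GPU) exposes `.item()`, which returns a plain Python number.

```python
def to_python_float(value):
    """Convert a scalar-like value (e.g. a one-element tensor) to a built-in float."""
    # torch.Tensor.item() copies a one-element tensor to the host and returns
    # a Python number, so it also covers tensors that live on the GPU.
    if hasattr(value, "item"):
        value = value.item()
    return float(value)


def build_result(loss, accuracy):
    """Hypothetical helper: shape the dict returned from the worker's compute()."""
    return {
        "loss": to_python_float(loss),
        "info": {"accuracy": to_python_float(accuracy)},
    }
```

Inside `compute`, something like `return build_result(loss, acc)` would then hand back only built-in floats instead of tensors.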

from hpbandster.

sfalkner commented on August 11, 2024

Make sure all values in the info dictionary are Python built-in types as well. Anything else there can also lead to workers dying. If your issue is resolved, please close it. Thank you!
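
One way to follow this advice is to sanitize the info dictionary before returning it. The sketch below is an assumption-laden illustration: `to_builtin` is a hypothetical helper, it relies on NumPy arrays/scalars and PyTorch tensors exposing `.tolist()`, and the `json.dumps` call is only a rough sanity check, not the serializer hpbandster itself uses.

```python
import json


def to_builtin(obj):
    """Recursively convert a value into plain Python built-in types."""
    if isinstance(obj, dict):
        return {str(key): to_builtin(val) for key, val in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_builtin(val) for val in obj]
    # Exact type check on purpose: subclasses such as numpy.float64 fall
    # through to the conversion branches below.
    if obj is None or type(obj) in (bool, int, float, str):
        return obj
    if hasattr(obj, "tolist"):  # NumPy arrays/scalars, PyTorch tensors
        return to_builtin(obj.tolist())
    if hasattr(obj, "item"):  # other one-element scalar wrappers
        return to_builtin(obj.item())
    return str(obj)  # last resort: keep a readable representation


def clean_info(info):
    """Return a sanitized copy of `info` and check that it serializes."""
    cleaned = to_builtin(info)
    json.dumps(cleaned)  # raises TypeError if something non-built-in remains
    return cleaned
```

Calling `clean_info(info)` just before building the return value of `compute` keeps tensors and array scalars out of the result.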


shukon commented on August 11, 2024

It is always easier (and most of the time only possible at all) to answer if you provide a stack trace of the error, details of your hardware and environment, and a minimal working example (so the error can be reproduced and, if necessary, fixed).


tzc1994 commented on August 11, 2024

@shukon Thank you for replying!
I use this tool to search for the number of filters in one layer of a CNN, with PyTorch as the backend.
I started from the MNIST example and its PyTorch code file, and modified it to fit my project.
Here is my configuration space:
```python
# Fragment of the worker class.
# Assumes: import ConfigSpace as CS; import ConfigSpace.hyperparameters as CSH
@staticmethod
def get_configspace():
    """
    It builds the configuration space with the needed hyperparameters.
    It is easily possible to implement different types of hyperparameters.
    Besides float hyperparameters on a log scale, it can also handle categorical input parameters.
    :return: ConfigurationSpace object
    """
    cs = CS.ConfigurationSpace()
    # default_value must be a number, not the string '1e-2'
    lr = CSH.UniformFloatHyperparameter('lr', lower=1e-6, upper=1e-1, default_value=1e-2, log=True)
    sgd_momentum = CSH.UniformFloatHyperparameter('sgd_momentum', lower=0.0, upper=0.99, default_value=0.9, log=False)
    cs.add_hyperparameters([lr, sgd_momentum])
    # Only filter_num9 is searched for now; the other layers are disabled.
    #filter_num1 = CSH.UniformIntegerHyperparameter('filter_num1', lower=16, upper=48, default_value=32, log=True)
    #filter_num2 = CSH.UniformIntegerHyperparameter('filter_num2', lower=32, upper=96, default_value=64, log=True)
    #filter_num3 = CSH.UniformIntegerHyperparameter('filter_num3', lower=32, upper=96, default_value=64, log=True)
    #filter_num4 = CSH.UniformIntegerHyperparameter('filter_num4', lower=64, upper=192, default_value=128, log=True)
    #filter_num5 = CSH.UniformIntegerHyperparameter('filter_num5', lower=64, upper=192, default_value=128, log=True)
    #filter_num6 = CSH.UniformIntegerHyperparameter('filter_num6', lower=128, upper=384, default_value=256, log=True)
    #filter_num7 = CSH.UniformIntegerHyperparameter('filter_num7', lower=128, upper=384, default_value=256, log=True)
    #filter_num8 = CSH.UniformIntegerHyperparameter('filter_num8', lower=128, upper=384, default_value=256, log=True)
    filter_num9 = CSH.UniformIntegerHyperparameter('filter_num9', lower=128, upper=384, default_value=256, log=True)
    #filter_num10 = CSH.UniformIntegerHyperparameter('filter_num10', lower=128, upper=384, default_value=256, log=True)
    #filter_num11 = CSH.UniformIntegerHyperparameter('filter_num11', lower=128, upper=384, default_value=256, log=True)
    #filter_num12 = CSH.UniformIntegerHyperparameter('filter_num12', lower=256, upper=768, default_value=512, log=True)
    #filter_num13 = CSH.UniformIntegerHyperparameter('filter_num13', lower=256, upper=768, default_value=512, log=True)

    cs.add_hyperparameters([filter_num9])
    return cs
```
I am only searching one layer for now.
The resulting log output is as follows:
DEBUG:hpbandster:DISPATCHER: Finished worker discovery
DEBUG:hpbandster.run_0.worker.ubuntu1604.16378:WORKER: shutting down now!
DEBUG:hpbandster:DISPATCHER: Starting worker discovery
DEBUG:hpbandster:DISPATCHER: Found 0 potential workers, 1 currently in the pool.
INFO:hpbandster:DISPATCHER: removing dead worker, hpbandster.run_0.worker.ubuntu1604.16378140503776417536
INFO:hpbandster:Job (0, 0, 0) was not completed
DEBUG:hpbandster:HBMASTER: number of workers changed to 0
DEBUG:hpbandster:adjust_queue_size: lock accquired
INFO:hpbandster:HBMASTER: adjusted queue size to (-1, 0)
DEBUG:hpbandster:DISPATCHER: job (0, 0, 0) finished
DEBUG:hpbandster:DISPATCHER: Trying to submit another job.
DEBUG:hpbandster:HBMASTER: running jobs: 1, queue sizes: (-1, 0) -> wait
DEBUG:hpbandster:DISPATCHER: jobs to submit = 0, number of idle workers = 0 -> waiting!
DEBUG:hpbandster:DISPATCHER: register_result: lock acquired
DEBUG:hpbandster:DISPATCHER: job (0, 0, 0) on hpbandster.run_0.worker.ubuntu1604.16378140503776417536 finished
DEBUG:hpbandster:job_id: (0, 0, 0)
kwargs: {'config': {'filter_num9': 252, 'lr': 0.029349927740529143, 'sgd_momentum': 0.6664149273388896}, 'budget': 10.0, 'working_directory': '.'}
result: None
exception: Worker died unexpectedly.

DEBUG:hpbandster:job_callback for (0, 0, 0) started
DEBUG:hpbandster:job_callback for (0, 0, 0) got condition
WARNING:hpbandster:job (0, 0, 0) failed with exception
Worker died unexpectedly.
DEBUG:hpbandster:Only 1 run(s) for budget 10.000000 available, need more than 5 -> can't build model!
DEBUG:hpbandster:job_callback for (0, 0, 0) finished
DEBUG:hpbandster:DISPATCHER: Finished worker discovery
DEBUG:hpbandster:DISPATCHER: Starting worker discovery
DEBUG:hpbandster:DISPATCHER: Found 0 potential workers, 0 currently in the pool.
DEBUG:hpbandster:DISPATCHER: Finished worker discovery

Also, I would like to know the maximum number of hyperparameters this tool can search.
Thanks!


sfalkner commented on August 11, 2024

The message in the second line, `WORKER: shutting down now!`, suggests that your worker script terminated without waiting for any jobs. Could you please post the part of the script responsible for the worker here?
I think I know what the problem is, but I want to confirm it with your code first.
Thanks!

