one GPU (upsnet issue, open)

uber-research commented on May 28, 2024
one GPU

from upsnet.

Comments (5)

YuwenXiong commented on May 28, 2024

Please follow #36 (comment).

gh2517956473 commented on May 28, 2024

Thank you!

lfdeep commented on May 28, 2024

Thank you!
Hello, I used one GPU, but this error occurred:
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
File "upsnet/upsnet_end2end_train.py", line 414, in <module>
upsnet_train()
File "upsnet/upsnet_end2end_train.py", line 268, in upsnet_train
data, label, _ = train_iterator.next()
File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 330, in next
idx, batch = self._get_batch()
File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 309, in _get_batch
return self.data_queue.get()
File "/root/anaconda3/lib/python3.7/multiprocessing/queues.py", line 352, in get
res = self._reader.recv_bytes()
File "/root/anaconda3/lib/python3.7/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/root/anaconda3/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/root/anaconda3/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
File "/root/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 227, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 31613) is killed by signal: Bus error. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.
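
The error text above names insufficient shared memory (shm) as a possible cause and suggests rerunning with num_workers=0. As a rough sketch only (this is not from the upsnet code), the following checks how much space is free on the shared-memory mount and caps the worker count to match. It assumes Linux, where shared memory is normally mounted at /dev/shm. The helper names and the 512 MiB-per-worker threshold are hypothetical and chosen purely for illustration.

```python
import os
import shutil


def shm_free_bytes(path="/dev/shm"):
    """Return free bytes on the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free


def suggest_num_workers(requested, free_bytes,
                        min_bytes_per_worker=512 * 1024**2):
    """Cap the worker count so each worker has min_bytes_per_worker of shm.

    The threshold is an arbitrary illustrative value, not a PyTorch rule.
    Returns 0 (load data in the main process) when the shm size is unknown
    or too small for even one worker.
    """
    if free_bytes is None:
        return 0
    return max(0, min(requested, free_bytes // min_bytes_per_worker))


free = shm_free_bytes()
print("free shm bytes:", free)
print("suggested num_workers:", suggest_num_workers(4, free))
```

The suggested value could then be passed as the num_workers argument of the DataLoader. If training runs inside a Docker container, the shared-memory size can also be raised when starting the container with the --shm-size option of docker run.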

lfdeep commented on May 28, 2024

Can I use one GPU with 12G memory to train? Where does the code need to change?
Thank you very much!

Hello, can you run the code successfully on a single GPU?

pkuCactus commented on May 28, 2024

Thank you for the great work. What if I use Horovod on a single-GPU machine? I tried it and found it faster than running without Horovod; does this cause any problems? Also, how can I run multiple Horovod workers to mimic multiple GPUs on a single-GPU machine? Thanks a lot; I look forward to your reply.
