
training error about GC? (fcos, CLOSED)

tianzhi0549 commented on August 17, 2024
training error about GC?

from fcos.

Comments (6)

ZhengyuZhang96 commented on August 17, 2024

By the way, I first encountered this error when training reached step 1400.
How can I solve it? Thanks!


tianzhi0549 commented on August 17, 2024

@studyharderer Please try setting DATALOADER.NUM_WORKERS to 0.
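
For readers unfamiliar with the dotted option name: it addresses the NUM_WORKERS entry nested under the DATALOADER section of the config. The sketch below is a stdlib-only stand-in for that kind of override, not the project's own config code. The helper `apply_overrides` and the default value of 4 are hypothetical and exist only for illustration.

```python
from copy import deepcopy


def apply_overrides(config, opts):
    """Return a copy of `config` with alternating KEY, VALUE pairs applied.

    Hypothetical helper: it mimics dotted-key overrides on a plain dict.
    """
    if len(opts) % 2 != 0:
        raise ValueError("opts must be alternating KEY, VALUE pairs")
    merged = deepcopy(config)
    for dotted_key, value in zip(opts[0::2], opts[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node[name]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError(f"unknown option: {dotted_key}")
        node[leaf] = value
    return merged


# Assumed default, for illustration only.
defaults = {"DATALOADER": {"NUM_WORKERS": 4}}
cfg = apply_overrides(defaults, ["DATALOADER.NUM_WORKERS", 0])
print(cfg["DATALOADER"]["NUM_WORKERS"])
```

With zero workers, a PyTorch DataLoader loads samples in the main process, which takes worker subprocesses out of the picture while debugging.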


ZhengyuZhang96 commented on August 17, 2024

@tianzhi0549 Thank you very much for your reply! After setting DATALOADER.NUM_WORKERS=0, training runs to iteration 3340 and then reports the following error:

Fatal Python error: GC object already tracked

Thread 0x00007f52917fe700 (most recent call first):

Thread 0x00007f52a6cbe700 (most recent call first):

Thread 0x00007f52a813e700 (most recent call first):

Thread 0x00007f5291fff700 (most recent call first):

Current thread 0x00007f5388734740 (most recent call first):
  File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torchvision/datasets/coco.py", line 115 in __getitem__
  File "/home/zzy/fcos/FCOS/maskrcnn_benchmark/data/datasets/coco.py", line 67 in __getitem__
  File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 529 in
  File "/home/zzy/anaconda3/envs/FCOS/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 529 in __next__
  File "/home/zzy/fcos/FCOS/maskrcnn_benchmark/engine/trainer.py", line 56 in do_train
  File "/home/zzy/fcos/FCOS/tools/train_net.py", line 73 in train
  File "/home/zzy/fcos/FCOS/tools/train_net.py", line 167 in main
  File "/home/zzy/fcos/FCOS/tools/train_net.py", line 174 in <module>

What does this error mean? How can I solve it?
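
The message comes from the CPython interpreter itself, not from a Python-level exception, so a try/except in the training script will not catch it. As a hedged debugging aid, the standard-library `faulthandler` module can write similar per-thread "most recent call first" dumps to a log file when the process crashes (for example on a segmentation fault) or on demand. This is a minimal sketch; the log file name `train_crash.log` is a made-up example, and where to enable it in the training script is left as an assumption.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location. The file object must stay open for as long
# as faulthandler is enabled, because it writes to the file descriptor.
log_path = os.path.join(tempfile.gettempdir(), "train_crash.log")
crash_log = open(log_path, "w+")

# Dump Python tracebacks of all threads if the process receives a fatal
# signal such as SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL.
faulthandler.enable(file=crash_log, all_threads=True)

# The same kind of dump can be requested at any time, e.g. near a
# suspicious step, to see what each thread is doing.
faulthandler.dump_traceback(file=crash_log, all_threads=True)

crash_log.seek(0)
dump = crash_log.read()
print(dump)
```

The same handler can be switched on without code changes by running the interpreter with `python -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable.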


tianzhi0549 commented on August 17, 2024

@studyharderer I am not sure why this error happens on your machine. Did you change any of the code? Can you try a larger batch size, such as 4?


ZhengyuZhang96 commented on August 17, 2024

@tianzhi0549 I think I might have found the problem. When I started training, the server's CPU usage rose above 400%, while my single 1080 Ti was at 60% usage. How do I reduce the CPU usage and move the work to the GPU?
I found that your code uses model.to(device) to put the model on the GPU. Is the training data on the GPU or the CPU? I didn't understand the dataloader part of your code.
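
On the CPU-usage question, one commonly used knob that is independent of this repository is to cap the thread pools of native numeric libraries through environment variables, set before those libraries are imported. The sketch below is only an illustration: the helper name `limit_native_threads` and the value 1 are arbitrary, and nothing in this thread establishes that doing so affects the crash.

```python
import os


def limit_native_threads(n):
    """Cap OpenMP / MKL / OpenBLAS thread pools via environment variables.

    Must run before numpy, torch, etc. are imported, because those
    libraries typically read the variables once at load time.
    """
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
        os.environ[var] = str(n)


limit_native_threads(1)  # arbitrary example value

# Import numeric libraries only after this point.
print("logical CPUs:", os.cpu_count())
print("OMP_NUM_THREADS =", os.environ["OMP_NUM_THREADS"])
```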


tianzhi0549 commented on August 17, 2024

@studyharderer I don't think the error has to do with CPU or GPU usage. Does your server have limited host memory?
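
One way to answer the host-memory question on a Linux server is to read `/proc/meminfo` before and during training. This is a minimal sketch under that assumption; `parse_meminfo` is a hypothetical helper, and the sample numbers are made up so the parsing can be shown on any platform.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages counts) are
    plain numbers.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


# Made-up sample, for illustration only.
sample = "MemTotal:       16384000 kB\nMemAvailable:    2048000 kB\n"
info = parse_meminfo(sample)
print(f"{info['MemAvailable'] / info['MemTotal']:.1%} of host memory available")

if os.path.exists("/proc/meminfo"):  # Linux only
    with open("/proc/meminfo") as f:
        live = parse_meminfo(f.read())
    print("MemTotal (kB):", live.get("MemTotal"))
    print("MemAvailable (kB):", live.get("MemAvailable"))
```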

