Comments (7)
You can add it after loss.backward().
from fewshot_detection.
Adding del loss and torch.cuda.empty_cache() solves this problem.
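For anyone wondering where these calls would go, here is a minimal sketch of a training loop, not the repository's actual train script. The model, data, and the `empty_cache_every` parameter are hypothetical stand-ins; the idea is only that `del loss` and `torch.cuda.empty_cache()` come after `loss.backward()` and the optimizer step.

```python
import torch
import torch.nn as nn


def train_one_epoch(model, batches, optimizer, device, empty_cache_every=1):
    """Run one pass over `batches`, releasing per-step tensors eagerly.

    `empty_cache_every` is a hypothetical knob: emptying the cache on every
    step can be slow, so it can be raised to empty it only every N steps.
    """
    model.train()
    criterion = nn.MSELoss()
    total = 0.0
    for step, (inputs, targets) in enumerate(batches, start=1):
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        # .item() returns a plain Python float, so the running total does not
        # hold a reference to the loss tensor or its autograd history.
        total += loss.item()
        # Drop the reference to the loss tensor before the next iteration.
        del loss
        # Return cached, unused blocks to the driver (CUDA only).
        if device.type == "cuda" and empty_cache_every and step % empty_cache_every == 0:
            torch.cuda.empty_cache()
    return total / len(batches)


# Toy model and random data, purely for illustration.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]
avg_loss = train_one_epoch(model, batches, optimizer, device)
```

If the loop accumulates the loss for logging, summing `loss.item()` instead of the loss tensor itself avoids keeping autograd history alive across iterations.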
from fewshot_detection.
Actually, calling empty_cache() makes GPU operations really slow (60 hours for the fine-tuning step). Is there another workaround?
If I simply do del loss without emptying the cache, the out-of-memory error still happens.
from fewshot_detection.
torch.cuda.empty_cache()
Hi @quanvuong, would you mind elaborating on where to add these? Much appreciated!
from fewshot_detection.
Hi @quanvuong,
Have you solved this problem?
I ran into a similar one when evaluating the baseline model: it fails with CUDA error: out of memory,
apparently because data accumulates from each iteration.
I am using torch 0.4.1. I have already tried empty_cache()
and del metax, mask,
but neither helps.
from fewshot_detection.
In my case, I used torch v0.4.1 instead of v0.3.1, which the author used. I solved my problem by wrapping the validation code in with torch.no_grad(),
because in v0.4 the volatile flag on the Variable class no longer disables gradient tracking, so memory accumulates on the GPU.
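Roughly, the change looks like the sketch below. The `evaluate` function, the toy model, and the random inputs are hypothetical stand-ins for the repository's validation code; the relevant part is the `with torch.no_grad():` block around the forward passes.

```python
import torch
import torch.nn as nn


def evaluate(model, batches, device):
    """Run inference over `batches` without building autograd graphs."""
    model.eval()
    outputs = []
    # Inside no_grad, forward passes do not record history, so intermediate
    # activations are freed as soon as they are no longer needed.
    with torch.no_grad():
        for inputs in batches:
            out = model(inputs.to(device))
            # Move results to the CPU so collected outputs do not pile up
            # in GPU memory across iterations.
            outputs.append(out.cpu())
    return torch.cat(outputs)


# Toy model and random data, purely for illustration.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2).to(device)
batches = [torch.randn(8, 4) for _ in range(3)]
preds = evaluate(model, batches, device)
```

Code written for v0.3.1 that passes `volatile=True` when constructing a `Variable` will still run on v0.4.x, but the flag has no effect there, which is why the explicit `no_grad` context is needed.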
from fewshot_detection.
Based on my understanding, there are two reasons for the out-of-memory error during tuning:
- During the tuning phase, 20 classes instead of 15 are fed into the re-weighting net, which uses more GPU memory.
- During the tuning phase, multi-scale training can produce input images as large as 600+, so memory usage varies from batch to batch.
Possible solutions:
1. Decrease the batch size a little.
2. Resize the input images carefully.
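One way to resize carefully is to cap the largest scale that multi-scale training may pick. The sketch below is an assumption-laden illustration, not the repository's code: the function name `pick_train_size`, the 320 to 608 range, and the stride of 32 are hypothetical values typical of YOLOv2-style multi-scale training, and `cap` is an extra limit you would tune to fit your GPU.

```python
import random


def pick_train_size(min_size=320, max_size=608, stride=32, cap=None, rng=random):
    """Pick a random square input size that is a multiple of `stride`.

    `cap` optionally lowers the upper bound so the largest batches still
    fit in GPU memory. All default values here are assumptions.
    """
    upper = max_size if cap is None else min(max_size, cap)
    upper -= upper % stride                  # round the upper bound down
    lower = min_size + (-min_size % stride)  # round the lower bound up
    if upper < lower:
        raise ValueError("cap is too small for the given min_size and stride")
    return rng.randrange(lower, upper + 1, stride)


# Example: never train on inputs larger than 480 pixels on a side.
rng = random.Random(0)
sizes = [pick_train_size(cap=480, rng=rng) for _ in range(200)]
```

Lowering the cap trades some of the benefit of multi-scale training for a predictable peak memory footprint, so it can be combined with a slightly smaller batch size rather than used alone.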
from fewshot_detection.